From 01a5188d74ef5ae54d73d892036e5187a158b396 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=EC=A1=B0=ED=95=B4=EC=B0=BD?= Date: Fri, 28 Nov 2025 23:31:40 +0900 Subject: [PATCH 1/5] add: overview --- book/_toc.yml | 3 ++- book/prescriptive_analytics/overview.md | 5 +++++ 2 files changed, 7 insertions(+), 1 deletion(-) create mode 100644 book/prescriptive_analytics/overview.md diff --git a/book/_toc.yml b/book/_toc.yml index 4c35a3e..beb9836 100644 --- a/book/_toc.yml +++ b/book/_toc.yml @@ -23,4 +23,5 @@ parts: sections: - file: cate_and_policy/parametric_cate.ipynb - file: cate_and_policy/nonparametric_cate.ipynb - - file: cate_and_policy/policy_learning.ipynb \ No newline at end of file + - file: cate_and_policy/policy_learning.ipynb + - file: prescriptive_analytics/overview.md \ No newline at end of file diff --git a/book/prescriptive_analytics/overview.md b/book/prescriptive_analytics/overview.md new file mode 100644 index 0000000..8bcc007 --- /dev/null +++ b/book/prescriptive_analytics/overview.md @@ -0,0 +1,5 @@ +# Prescriptive Analytics + +- Prescriptive analytics is an approach that uses data to derive optimal decisions. +- Approaches fall broadly into two families: **Prediction + Optimization** and **Causal Inference + Optimization**. +- This section focuses on **Causal Inference + Optimization** and covers **how to select the most efficient policy or strategy** based on the causal effect of an intervention (CATE). 
From 88ffc8168767cd0825add97db53f5912d3e1d919 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=EC=A1=B0=ED=95=B4=EC=B0=BD?= Date: Sun, 7 Dec 2025 20:44:34 +0900 Subject: [PATCH 2/5] update: duality r-learner --- book/_toc.yml | 4 +- ...rning_for_effectiveness_optimization.ipynb | 1344 +++++++++++++++++ 2 files changed, 1347 insertions(+), 1 deletion(-) create mode 100644 book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb diff --git a/book/_toc.yml b/book/_toc.yml index 04c86ca..4b59776 100644 --- a/book/_toc.yml +++ b/book/_toc.yml @@ -24,4 +24,6 @@ parts: - file: scm/backdoor_criterion.ipynb - file: scm/frontdoor_criterion.ipynb - file: scm/causal_discovery.ipynb - - file: prescriptive_analytics/overview.md \ No newline at end of file + - file: prescriptive_analytics/overview.md + sections: + - file: prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb \ No newline at end of file diff --git a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb new file mode 100644 index 0000000..c5b0644 --- /dev/null +++ b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb @@ -0,0 +1,1344 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "d6b70f64", + "metadata": {}, + "source": [ + "# Heterogeneous Causal Learning for Effectiveness Optimization" + ] + }, + { + "cell_type": "markdown", + "id": "b841ee2c", + "metadata": {}, + "source": [ + "- Conventional uplift models can estimate heterogeneous treatment effects, but they do not adequately account for benefit relative to cost.\n", + "\n", + "- In budget-constrained settings such as marketing, we need a treatment effect optimization approach that maximizes effect while accounting for cost.\n", + "\n", + "- To that end, we use the following algorithms.\n", + " - Duality R-learner\n", + " - Direct Ranking Model (DRM)\n", + " - Constrained Ranking Models" + ] + }, + { + "cell_type": "markdown", + "id": "21447276", + 
"metadata": {}, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "0f91658c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARNING: There was an error checking the latest version of pip.\u001b[0m\u001b[33m\n", + "\u001b[0mNote: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip -q install scikit-uplift" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "9114f7da", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn.linear_model import Ridge, LogisticRegression\n", + "from sklearn.metrics import r2_score, roc_auc_score\n", + "\n", + "from sklift.datasets import fetch_hillstrom\n", + "\n", + "import matplotlib.pyplot as plt\n", + "\n", + "RANDOM_STATE = 42\n", + "pd.set_option(\"display.max_columns\", 50)\n", + "\n", + "import warnings\n", + "warnings.filterwarnings(\"ignore\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "57cbb979", + "metadata": {}, + "source": [ + "### Hillstrom E-mail Test Dataset\n", + "\n", + "We use the Kevin Hillstrom E-mail Test Dataset. 
\n", + "The data is the log of an e-mail marketing A/B/n test.\n", + "\n", + "- **Treatment**: ${T}$\n", + " - `Mens E-Mail`, `Womens E-Mail` $\\Rightarrow$ ${T = 1}$ (e-mail sent)\n", + " - `No E-Mail` $\\Rightarrow$ ${T = 0}$ (control group)\n", + "\n", + "- **Gain outcome**: ${Y^r}$\n", + " - Amount spent over two weeks, `spend`\n", + " - The question of interest: “How much does spend increase if we send an e-mail?”\n", + "\n", + "- **Cost outcome**: ${Y^c}$\n", + " - Simplified to a cost of 1 unit per e-mail sent\n", + " - Hence $Y^c = T \\in \\{0,1\\}$\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "b2b3d7a2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "data shape: (64000, 8)\n", + "\n", + "spend (target) describe:\n", + "count 64000.000000\n", + "mean 1.050908\n", + "std 15.036448\n", + "min 0.000000\n", + "25% 0.000000\n", + "50% 0.000000\n", + "75% 0.000000\n", + "max 499.000000\n", + "Name: spend, dtype: float64\n", + "\n", + "segment (treatment_raw) distribution:\n", + "segment\n", + "Womens E-Mail 0.334172\n", + "Mens E-Mail 0.332922\n", + "No E-Mail 0.332906\n", + "Name: proportion, dtype: float64\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " recency history_segment history mens womens zip_code newbie channel\n", + "0 10 2) $100 - $200 142.44 1 0 Surburban 0 Phone\n", + "1 6 3) $200 - $350 329.08 1 1 Rural 1 Web\n", + "2 7 2) $100 - $200 180.65 0 1 Surburban 1 Web\n", + "3 9 5) $500 - $750 675.83 1 0 Rural 1 Web\n", + "4 2 1) $0 - $100 45.34 1 0 Urban 0 Web" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dataset = fetch_hillstrom(target_col=\"spend\", return_X_y_t=False)\n", + "\n", + "data = dataset.data.copy() # X (features, not yet preprocessed)\n", + "y_gain = dataset.target.copy() # Y^r = spend\n", + "treatment_raw = dataset.treatment.copy() # 'Mens E-Mail', 'Womens E-Mail', 'No E-Mail'\n", + "\n", + "print(\"data shape:\", data.shape)\n", + "print(\"\\nspend (target) describe:\")\n", + "print(y_gain.describe())\n", + "\n", + "print(\"\\nsegment (treatment_raw) distribution:\")\n", + "print(treatment_raw.value_counts(normalize=True))\n", + "\n", + "data.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "2ff956f2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Treatment ratio (T=1): 0.66709375\n", + "\n", + "Y_gain (spend) summary:\n", + "count 64000.000000\n", + "mean 1.050908\n", + "std 15.036448\n", + "min 0.000000\n", + "25% 0.000000\n", + "50% 0.000000\n", + "75% 0.000000\n", + "max 499.000000\n", + "Name: spend, dtype: float64\n", + "\n", + "Y_cost distribution:\n", + "segment\n", + "1.0 0.667094\n", + "0.0 0.332906\n", + "Name: proportion, dtype: float64\n" + ] + } + ], + "source": [ + "T = (treatment_raw != \"No E-Mail\").astype(int) # 1 if an e-mail was sent, else 0\n", + "\n", + "Y_gain = y_gain.astype(float) # spend (float)\n", + "Y_cost = T.astype(float) # e-mail sending cost (0/1)\n", + "\n", + "print(\"Treatment ratio (T=1):\", T.mean())\n", + "print(\"\\nY_gain (spend) summary:\")\n", + "print(pd.Series(Y_gain).describe())\n", + "\n", + "print(\"\\nY_cost distribution:\")\n", + 
"print(pd.Series(Y_cost).value_counts(normalize=True).rename(\"proportion\"))\n" + ] + }, + { + "cell_type": "markdown", + "id": "bfdbc4fd", + "metadata": {}, + "source": [ + "### Feature Preprocessing\n", + "\n", + "Examples of the main Hillstrom features:\n", + "\n", + "- `recency`, `history`, `mens`, `womens`, `newbie`, etc.: numeric/0-1 variables\n", + "- `history_segment`, `zip_code`, `channel`: categorical\n", + "\n", + "To feed them into the R-learner / propensity models,\n", + "\n", + "- numeric columns are used as-is, and\n", + "- categorical columns (`history_segment`, `zip_code`, `channel`) are one-hot encoded.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "e286adf5", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Original feature columns:\n", + "['recency', 'history_segment', 'history', 'mens', 'womens', 'zip_code', 'newbie', 'channel']\n", + "\n", + "Numeric columns:\n", + "['recency', 'history', 'mens', 'womens', 'newbie']\n", + "\n", + "Categorical columns (to be one-hot encoded):\n", + "['history_segment', 'zip_code', 'channel']\n", + "\n", + "Feature shape after preprocessing: (64000, 15)\n", + "Preprocessed feature columns:\n", + "Index(['recency', 'history', 'mens', 'womens', 'newbie',\n", + " 'history_segment_2) $100 - $200', 'history_segment_3) $200 - $350',\n", + " 'history_segment_4) $350 - $500', 'history_segment_5) $500 - $750',\n", + " 'history_segment_6) $750 - $1,000', 'history_segment_7) $1,000 +',\n", + " 'zip_code_Surburban', 'zip_code_Urban', 'channel_Phone', 'channel_Web'],\n", + " dtype='object')\n" + ] + } + ], + "source": [ + "print(\"Original feature columns:\")\n", + "print(data.columns.tolist())\n", + "\n", + "# categorical columns to one-hot encode\n", + "categorical_cols = [\"history_segment\", \"zip_code\", \"channel\"]\n", + "\n", + "# the rest are numeric/0-1 columns, used as-is\n", + "numeric_cols = [c for c in data.columns if c not in categorical_cols]\n", + "\n", + "print(\"\\nNumeric columns:\")\n", + "print(numeric_cols)\n", + "print(\"\\nCategorical columns (to be one-hot encoded):\")\n", + "print(categorical_cols)\n", + "\n", + "# 
one-hot encoding\n", + "X_cat = pd.get_dummies(data[categorical_cols], drop_first=True)\n", + "X_num = data[numeric_cols].reset_index(drop=True)\n", + "\n", + "X_df = pd.concat([X_num, X_cat], axis=1)\n", + "\n", + "print(\"\\nFeature shape after preprocessing:\", X_df.shape)\n", + "print(\"Preprocessed feature columns:\")\n", + "print(X_df.columns)\n", + "\n", + "# convert to a numpy array\n", + "X = X_df.values.astype(np.float32)\n" + ] + }, + { + "cell_type": "markdown", + "id": "eeb08865", + "metadata": {}, + "source": [ + "The dataset is split into three parts (training, validation, and test sets) in proportions of 60%, 20%, and 20%, respectively." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "3d964183", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train shape: (38400, 15)\n", + "Val shape: (12800, 15)\n", + "Test shape: (12800, 15)\n", + "\n", + "Treatment ratio (Train/Val/Test):\n", + "Train: 0.6670833333333334\n", + "Val : 0.667109375\n", + "Test : 0.667109375\n" + ] + } + ], + "source": [ + "# Train / Validation / Test split\n", + "X_train_val, X_test, T_train_val, T_test, Yg_train_val, Yg_test, Yc_train_val, Yc_test = train_test_split(\n", + " X, T, Y_gain, Y_cost,\n", + " test_size=0.2,\n", + " random_state=RANDOM_STATE,\n", + " stratify=T,\n", + ")\n", + "\n", + "X_train, X_val, T_train, T_val, Yg_train, Yg_val, Yc_train, Yc_val = train_test_split(\n", + " X_train_val, T_train_val, Yg_train_val, Yc_train_val,\n", + " test_size=0.25, # 0.25 * 0.8 = 0.2\n", + " random_state=RANDOM_STATE,\n", + " stratify=T_train_val,\n", + ")\n", + "\n", + "print(\"Train shape:\", X_train.shape)\n", + "print(\"Val shape:\", X_val.shape)\n", + "print(\"Test shape:\", X_test.shape)\n", + "\n", + "print(\"\\nTreatment ratio (Train/Val/Test):\")\n", + "print(\"Train:\", T_train.mean())\n", + "print(\"Val :\", T_val.mean())\n", + "print(\"Test :\", T_test.mean())\n" + ] + }, + { + "cell_type": "markdown", + "id": "576b8d00", + "metadata": {}, + "source": [ + "## Duality R-learner\n", + "\n", + "Duality 
R-learner combines the following two stages.\n", + "\n", + "1. Estimate the gain/cost CATEs with an R-learner\n", + " - $\\tau_r(x)$: gain uplift (e.g., spend uplift)\n", + " - $\\tau_c(x)$: cost uplift (e.g., the increase in e-mail sending cost)\n", + "\n", + "2. Optimize the budget constraint in its dual form\n", + " - Learn the Lagrange multiplier $\\lambda$ to find the optimal policy.\n", + "\n", + "The problem we want to solve is as follows.\n", + "\n", + "- Under a budget $B$:\n", + "\n", + " $$\n", + " \\max_{z_i \\in \\{0,1\\}} \\sum_i \\tau_r(x^{(i)}) z_i\n", + " \\quad \\text{s.t.} \\quad\n", + " \\sum_i \\tau_c(x^{(i)}) z_i \\le B\n", + " $$\n", + " \n", + "- $z_i = 1$ means customer $i$ is sent an e-mail; $z_i = 0$ means no e-mail is sent\n", + "\n", + "Key steps of the Duality R-learner:\n", + "\n", + "1. Nuisance models: fit $m_r(x)$, $e(x)$ \n", + "2. Gain / Cost R-learner: estimate $\\tau_r(x)$, $\\tau_c(x)$ \n", + "3. Duality: optimize $\\lambda$ by gradient ascent \n", + "4. Policy construction: $s(x) = \\tau_r(x) - \\lambda^* \\tau_c(x)$\n", + "5. Evaluate policy performance with the Cost Curve / AUCC " + ] + }, + { + "cell_type": "markdown", + "id": "5d9e213b", + "metadata": {}, + "source": [ + "### 1. 
Nuisance Models: $m_r(x)$ and $e(x)$\n", + "\n", + "The R-learner is based on the following equation.\n", + "\n", + "$$\n", + "Y - m^*(X)\n", + "= (T - e^*(X))\\,\\tau^*(X) + \\epsilon\n", + "$$\n", + "\n", + "where \n", + "- $m^*(X) = \\mathbb{E}[Y \\mid X]$: conditional mean of the outcome \n", + "- $e^*(X) = \\mathbb{P}(T=1 \\mid X)$: propensity score \n", + "\n", + "The nuisance models for the gain outcome are set up as follows.\n", + "\n", + "- $m_r(x)$: Ridge regression \n", + "- $e(x)$: logistic regression \n", + "\n", + "Because the cost outcome is $Y^c = T$, we have\n", + "$m_c(x) = e(x)$" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "3e718594", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== m_r(x) performance (R^2: spend regression) ==\n", + "Train R^2: 0.0011321804124860835\n", + "Val R^2: 0.0006200906912602333\n", + "\n", + "Prediction distribution (Val):\n", + "count 12800.000000\n", + "mean 0.998539\n", + "std 0.499257\n", + "min -4.757108\n", + "25% 0.690851\n", + "50% 0.961950\n", + "75% 1.234435\n", + "max 4.417754\n", + "dtype: float64\n" + ] + } + ], + "source": [ + "# Gain outcome mean model m_r(x): Ridge regression\n", + "m_r = Ridge(alpha=1.0, random_state=RANDOM_STATE)\n", + "m_r.fit(X_train, Yg_train)\n", + "\n", + "Yg_pred_train = m_r.predict(X_train)\n", + "Yg_pred_val = m_r.predict(X_val)\n", + "\n", + "r2_train = r2_score(Yg_train, Yg_pred_train)\n", + "r2_val = r2_score(Yg_val, Yg_pred_val)\n", + "\n", + "print(\"== m_r(x) performance (R^2: spend regression) ==\")\n", + "print(\"Train R^2:\", r2_train)\n", + "print(\"Val R^2:\", r2_val)\n", + "\n", + "print(\"\\nPrediction distribution (Val):\")\n", + "print(pd.Series(Yg_pred_val).describe())\n" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "1aa8d52b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== e(x) performance (AUC: treatment model) ==\n", + "Train AUC: 0.5117265124259399\n", + "Val AUC: 0.49799591470904553\n", + "\n", + "Propensity e(x) range:\n", + "Train: 0.633101626124968 → 0.8026040280233658\n", + "Val : 
0.6331449838328645 → 0.8183494704982611\n" + ] + } + ], + "source": [ + "# Propensity model e(x) = P(T=1 | X): logistic regression\n", + "propensity = LogisticRegression(\n", + " penalty=\"l2\",\n", + " C=1.0,\n", + " solver=\"lbfgs\",\n", + " max_iter=1000,\n", + " n_jobs=-1,\n", + ")\n", + "\n", + "propensity.fit(X_train, T_train)\n", + "\n", + "e_train = propensity.predict_proba(X_train)[:, 1]\n", + "e_val = propensity.predict_proba(X_val)[:, 1]\n", + "e_test = propensity.predict_proba(X_test)[:, 1]\n", + "\n", + "auc_train_e = roc_auc_score(T_train, e_train)\n", + "auc_val_e = roc_auc_score(T_val, e_val)\n", + "\n", + "print(\"== e(x) performance (AUC: treatment model) ==\")\n", + "print(\"Train AUC:\", auc_train_e)\n", + "print(\"Val AUC:\", auc_val_e)\n", + "\n", + "print(\"\\nPropensity e(x) range:\")\n", + "print(\"Train:\", e_train.min(), \"→\", e_train.max())\n", + "print(\"Val :\", e_val.min(), \"→\", e_val.max())\n" + ] + }, + { + "cell_type": "markdown", + "id": "a2b17bd8", + "metadata": {}, + "source": [ + "The ${e(x)}$ obtained here is reused later\n", + "\n", + "- to build the ${T - e(x)}$ term in the Gain R-learner, and\n", + "- as ${m_c(x)}$ in the Cost R-learner. " + ] + }, + { + "cell_type": "markdown", + "id": "06b5558e", + "metadata": {}, + "source": [ + "### 2. Gain R-learner: $\\tau_r(x)$\n", + "\n", + "For the gain outcome $Y^r$, the R-learner structure is as follows.\n", + "\n", + "$$\n", + "Y^r - m_r(X)\n", + "= (T - e(X))\\,\\tau_r(X) + \\epsilon\n", + "$$\n", + "\n", + "With a linear model $\\tau_r(x) = w_r^\\top x$, the training procedure is as follows.\n", + "\n", + "1. Compute residuals \n", + "\n", + " $$\n", + " r^Y = Y^r - \\hat m_r(X), \\quad r^T = T - \\hat e(X)\n", + " $$\n", + "\n", + "2. Row-wise scaling \n", + "\n", + " $$\n", + " Z = X \\odot r^T\n", + " $$\n", + "\n", + "3. Regression \n", + "\n", + " $$\n", + " r^Y \\approx Z w_r\n", + " $$\n", + "\n", + "4. 
Final CATE \n", + "\n", + " $$\n", + " \\hat\\tau_r(x) = w_r^\\top x\n", + " $$" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "28eb5e7c", + "metadata": {}, + "outputs": [], + "source": [ + "def fit_r_learner_linear(\n", + " X_tr, X_val,\n", + " T_tr, T_val,\n", + " Y_tr, Y_val,\n", + " m_tr, m_val,\n", + " e_tr, e_val,\n", + " alpha=1.0,\n", + " name=\"R-learner\",\n", + "):\n", + " \"\"\"\n", + " Fit a linear τ(x) = w^T x with the R-learner approach.\n", + " - X_tr, X_val: feature matrices\n", + " - T_tr, T_val: treatment (0/1)\n", + " - Y_tr, Y_val: outcome\n", + " - m_tr, m_val: predictions of m(x) = E[Y|X]\n", + " - e_tr, e_val: predictions of e(x) = P(T=1|X)\n", + " \"\"\"\n", + " X_tr = np.asarray(X_tr)\n", + " X_val = np.asarray(X_val)\n", + " T_tr = np.asarray(T_tr).astype(float)\n", + " T_val = np.asarray(T_val).astype(float)\n", + " Y_tr = np.asarray(Y_tr).astype(float)\n", + " Y_val = np.asarray(Y_val).astype(float)\n", + " m_tr = np.asarray(m_tr).astype(float)\n", + " m_val = np.asarray(m_val).astype(float)\n", + " e_tr = np.asarray(e_tr).astype(float)\n", + " e_val = np.asarray(e_val).astype(float)\n", + "\n", + " # residuals\n", + " rY_tr = Y_tr - m_tr\n", + " rT_tr = T_tr - e_tr\n", + "\n", + " # Z = X * rT (scale each row by rT)\n", + " Z_tr = X_tr * rT_tr.reshape(-1, 1)\n", + "\n", + " # regression: rY ~ Z\n", + " tau_model = Ridge(alpha=alpha, fit_intercept=False, random_state=RANDOM_STATE)\n", + " tau_model.fit(Z_tr, rY_tr)\n", + "\n", + " # τ_hat(x) = w^T x\n", + " tau_tr = tau_model.predict(X_tr)\n", + " tau_val = tau_model.predict(X_val)\n", + "\n", + " print(f\"== {name} summary ==\")\n", + " print(\"Train τ_hat summary:\")\n", + " print(pd.Series(tau_tr).describe())\n", + " print(\"\\nVal τ_hat summary:\")\n", + " print(pd.Series(tau_val).describe())\n", + "\n", + " return tau_model, tau_tr, tau_val\n" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "03fc2e68", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== 
Gain R-learner τ_r(x) summary ==\n", + "Train τ_hat summary:\n", + "count 38400.000000\n", + "mean 0.562616\n", + "std 0.454530\n", + "min -2.669276\n", + "25% 0.265676\n", + "50% 0.528041\n", + "75% 0.837171\n", + "max 1.976344\n", + "dtype: float64\n", + "\n", + "Val τ_hat summary:\n", + "count 12800.000000\n", + "mean 0.567048\n", + "std 0.453703\n", + "min -3.313805\n", + "25% 0.272715\n", + "50% 0.530470\n", + "75% 0.836859\n", + "max 1.944813\n", + "dtype: float64\n" + ] + } + ], + "source": [ + "# m_r(x) predictions\n", + "m_r_train = m_r.predict(X_train)\n", + "m_r_val = m_r.predict(X_val)\n", + "\n", + "tau_r_model, tau_r_train, tau_r_val = fit_r_learner_linear(\n", + " X_tr=X_train,\n", + " X_val=X_val,\n", + " T_tr=T_train,\n", + " T_val=T_val,\n", + " Y_tr=Yg_train,\n", + " Y_val=Yg_val,\n", + " m_tr=m_r_train,\n", + " m_val=m_r_val,\n", + " e_tr=e_train,\n", + " e_val=e_val,\n", + " alpha=1.0,\n", + " name=\"Gain R-learner τ_r(x)\",\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "e8ae0ccd", + "metadata": {}, + "source": [ + "### 3. Cost R-learner: $\\tau_c(x)$\n", + "\n", + "Because the cost outcome is $Y^c = T$, \n", + "its nuisance model is already available:\n", + "\n", + "$$m_c(x) = e(x)$$\n", + "\n", + "The Cost R-learner equation is\n", + "\n", + "$$\n", + "Y^c - m_c(X)\n", + "= (T - e(X))\\,\\tau_c(X)\n", + "$$\n", + "\n", + "We fit $\\tau_c(x)$ with the same R-learner structure as for gain." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "f40867a2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== Cost R-learner τ_c(x) summary ==\n", + "Train τ_hat summary:\n", + "count 38400.000000\n", + "mean 0.974710\n", + "std 0.156975\n", + "min 0.429515\n", + "25% 0.894193\n", + "50% 0.978889\n", + "75% 1.046973\n", + "max 2.098633\n", + "dtype: float64\n", + "\n", + "Val τ_hat summary:\n", + "count 12800.000000\n", + "mean 0.974875\n", + "std 0.157403\n", + "min 0.424263\n", + "25% 0.894096\n", + "50% 0.979144\n", + "75% 1.046372\n", + "max 2.265868\n", + "dtype: float64\n" + ] + } + ], + "source": [ + "# The cost outcome mean m_c(x) reuses e(x) directly\n", + "m_c_train = e_train\n", + "m_c_val = e_val\n", + "\n", + "tau_c_model, tau_c_train, tau_c_val = fit_r_learner_linear(\n", + " X_tr=X_train,\n", + " X_val=X_val,\n", + " T_tr=T_train,\n", + " T_val=T_val,\n", + " Y_tr=Yc_train,\n", + " Y_val=Yc_val,\n", + " m_tr=m_c_train,\n", + " m_val=m_c_val,\n", + " e_tr=e_train,\n", + " e_val=e_val,\n", + " alpha=1.0,\n", + " name=\"Cost R-learner τ_c(x)\",\n", + ")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "48f3ae13", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== τ_r(x) summary ==\n", + "[Train]\n", + "count 38400.000000\n", + "mean 0.562616\n", + "std 0.454530\n", + "min -2.669276\n", + "25% 0.265676\n", + "50% 0.528041\n", + "75% 0.837171\n", + "max 1.976344\n", + "dtype: float64\n", + "\n", + "[Val]\n", + "count 12800.000000\n", + "mean 0.567048\n", + "std 0.453703\n", + "min -3.313805\n", + "25% 0.272715\n", + "50% 0.530470\n", + "75% 0.836859\n", + "max 1.944813\n", + "dtype: float64\n", + "\n", + "[Test]\n", + "count 12800.000000\n", + "mean 0.568442\n", + "std 0.450854\n", + "min -2.486865\n", + "25% 0.268170\n", + "50% 0.534959\n", + "75% 0.841292\n", + "max 1.970332\n", + "dtype: float64\n", + "\n", + "== τ_c(x) summary ==\n", 
+ "[Train]\n", + "count 38400.000000\n", + "mean 0.974710\n", + "std 0.156975\n", + "min 0.429515\n", + "25% 0.894193\n", + "50% 0.978889\n", + "75% 1.046973\n", + "max 2.098633\n", + "dtype: float64\n", + "\n", + "[Val]\n", + "count 12800.000000\n", + "mean 0.974875\n", + "std 0.157403\n", + "min 0.424263\n", + "25% 0.894096\n", + "50% 0.979144\n", + "75% 1.046372\n", + "max 2.265868\n", + "dtype: float64\n", + "\n", + "[Test]\n", + "count 12800.000000\n", + "mean 0.975139\n", + "std 0.155813\n", + "min 0.436810\n", + "25% 0.895776\n", + "50% 0.978849\n", + "75% 1.045682\n", + "max 1.803177\n", + "dtype: float64\n" + ] + } + ], + "source": [ + "# CATE predictions on the test set\n", + "tau_r_test = tau_r_model.predict(X_test)\n", + "tau_c_test = tau_c_model.predict(X_test)\n", + "\n", + "print(\"== τ_r(x) summary ==\")\n", + "print(\"[Train]\")\n", + "print(pd.Series(tau_r_train).describe())\n", + "print(\"\\n[Val]\")\n", + "print(pd.Series(tau_r_val).describe())\n", + "print(\"\\n[Test]\")\n", + "print(pd.Series(tau_r_test).describe())\n", + "\n", + "print(\"\\n== τ_c(x) summary ==\")\n", + "print(\"[Train]\")\n", + "print(pd.Series(tau_c_train).describe())\n", + "print(\"\\n[Val]\")\n", + "print(pd.Series(tau_c_val).describe())\n", + "print(\"\\n[Test]\")\n", + "print(pd.Series(tau_c_test).describe())\n" + ] + }, + { + "cell_type": "markdown", + "id": "e6e387a2", + "metadata": {}, + "source": [ + "### 4. Duality: Lagrangian-based optimization of $\\lambda$ under a budget constraint\n", + "\n", + "The problem we want to solve is as follows.\n", + "\n", + "$$\n", + "\\begin{aligned}\n", + "\\max_{z_i \\in \\{0,1\\}} &\\quad \\sum_i \\tau_r(x^{(i)}) z_i \\\\\n", + "\\text{s.t.} &\\quad \\sum_i \\tau_c(x^{(i)}) z_i \\le B.\n", + "\\end{aligned}\n", + "$$\n", + "\n", + "Here $z_i = 1$ means customer $i$ is targeted, and $B$ is the total budget. 
\n", + "Writing the problem as a minimization of the negated gain and introducing a Lagrange multiplier $\\lambda \\ge 0$ for the constraint, the Lagrangian is\n", + "\n", + "$$\n", + "L(z,\\lambda)\n", + "= -\\sum_i \\tau_r(x^{(i)}) z_i\n", + "+ \\lambda\\left(\\sum_i \\tau_c(x^{(i)}) z_i - B\\right)\n", + "$$\n", + "\n", + "For a fixed $\\lambda$, the effectiveness score of customer $i$ is\n", + "\n", + "$$\n", + "s_i(\\lambda) = \\tau_r(x^{(i)}) - \\lambda\\, \\tau_c(x^{(i)}).\n", + "$$\n", + "\n", + "Targeting pays off when the score is non-negative, so \n", + "we choose $z_i = 1$ if $s_i(\\lambda) \\ge 0$ and $z_i = 0$ if it is negative. \n", + "In other words, given $\\lambda$, we simply select the customers whose $s_i(\\lambda)$ is non-negative.\n", + "\n", + "The gradient of the dual objective can be approximated as\n", + "\n", + "$$\n", + "\\frac{\\partial g}{\\partial \\lambda}\n", + "\\approx \\sum_i z_i \\tau_c(x^{(i)}) - B\n", + "$$\n", + "\n", + "and the corresponding gradient ascent update is\n", + "\n", + "$$\n", + "\\lambda \\leftarrow \\bigl[\\lambda + \\eta(\\text{cost\\_used} - B)\\bigr]_+\n", + "$$\n", + "\n", + "Here $\\eta$ is the learning rate and $[\\cdot]_+$ is the projection that keeps $\\lambda$ from becoming negative.\n", + "\n", + "When the budget is exceeded $(\\text{cost\\_used} > B)$, $\\lambda$ increases and penalizes cost more strongly; \n", + "when less than the budget is used, $\\lambda$ decreases so that more customers can be selected.\n", + "\n", + "We set the budget $B$ from the sum of positive cost CATEs on the training data, \n", + "and apply the rule above repeatedly to learn the final $\\lambda^*$ and the policy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "5fe3e686", + "metadata": {}, + "outputs": [], + "source": [ + "def duality_learn_lambda(\n", + " tau_r,\n", + " tau_c,\n", + " budget_fraction=0.3,\n", + " lr=1e-5,\n", + " n_iter=200,\n", + " verbose_every=20,\n", + "):\n", + " \"\"\"\n", + " Given τ_r and τ_c, learn λ by duality gradient ascent.\n", + " - budget_fraction: fraction of the total positive cost effect to use as the budget\n", + " \"\"\"\n", + " tau_r = np.asarray(tau_r).astype(float)\n", + " tau_c = np.asarray(tau_c).astype(float)\n", + "\n", + " # use only positive cost effects when computing the budget\n", + " tau_c_pos = np.clip(tau_c, a_min=0.0, a_max=None)\n", + " total_pos_cost = tau_c_pos.sum()\n", + " B = budget_fraction * total_pos_cost\n", + "\n", + " lam = 0.0\n", + "\n", + " for it in range(n_iter + 1):\n", + " # effectiveness score\n", + " s = tau_r - lam * tau_c\n", + "\n", + " # z_i: selection indicator (selected if s_i >= 0)\n", + " z = (s >= 0).astype(float)\n", + "\n", + " cost_used = (tau_c_pos * z).sum()\n", + " gain_used = (np.clip(tau_r, 0.0, None) * z).sum()\n", + "\n", + " # ∂g/∂λ ≈ cost_used - B\n", + " grad = cost_used - B\n", + "\n", + " # gradient ascent (keep λ >= 0)\n", + " lam = max(0.0, lam + lr * grad)\n", + "\n", + " if it % verbose_every == 0:\n", + " sel_ratio = z.mean()\n", + " print(\n", + " f\"[iter {it:03d}] λ={lam:.6f}, \"\n", + " f\"cost_used={cost_used:.4f}, gain_used={gain_used:.4f}, \"\n", + " f\"grad={grad:.4f}, selected={sel_ratio:.3f}\"\n", + " )\n", + "\n", + " print(\"\\nFinal λ*:\", lam)\n", + " print(\"Sum of positive cost effects:\", total_pos_cost)\n", + " print(f\"Budget B (fraction={budget_fraction}):\", B)\n", + " return lam, B\n" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "id": "233a6a45", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[iter 000] λ=0.228487, cost_used=34077.3618, gain_used=22363.6067, grad=22848.7019, selected=0.907\n", + "[iter 020] λ=0.784009, cost_used=11251.8179, gain_used=12501.7237, 
grad=23.1579, selected=0.298\n", + "[iter 040] λ=0.784586, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", + "[iter 060] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", + "[iter 080] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", + "[iter 100] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", + "[iter 120] λ=0.784588, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", + "[iter 140] λ=0.784588, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", + "[iter 160] λ=0.784588, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", + "[iter 180] λ=0.784589, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", + "[iter 200] λ=0.784589, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", + "\n", + "Final λ*: 0.7845891317539325\n", + "Sum of positive cost effects: 37428.86642372066\n", + "Budget B (fraction=0.3): 11228.659927116198\n" + ] + } + ], + "source": [ + "lambda_star, B = duality_learn_lambda(\n", + " tau_r=tau_r_train,\n", + " tau_c=tau_c_train,\n", + " budget_fraction=0.3,\n", + " lr=1e-5,\n", + " n_iter=200,\n", + " verbose_every=20,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "a54ac8cf", + "metadata": {}, + "outputs": [], + "source": [ + "def selection_summary(tau_r, tau_c, lam, name=\"\"):\n", + " tau_r = np.asarray(tau_r).astype(float)\n", + " tau_c = np.asarray(tau_c).astype(float)\n", + "\n", + " s = tau_r - lam * tau_c\n", + " z = (s >= 0).astype(float)\n", + "\n", + " gain_pos = np.clip(tau_r, 0.0, None)\n", + " cost_pos = np.clip(tau_c, 0.0, None)\n", + "\n", + " gain_used = (gain_pos * z).sum()\n", + " cost_used = (cost_pos * z).sum()\n", + " sel_ratio = z.mean()\n", + "\n", + " ratio = gain_used / cost_used if cost_used > 0 else np.nan\n", + "\n", + " print(f\"\\n== 
Selection summary ({name}) ==\")\n", + " print(f\"λ = {lam:.6f}\")\n", + " print(f\"Selection ratio: {sel_ratio:.3f} ({z.sum():.0f} / {len(z)})\")\n", + " print(f\"Total gain (∑ τ_r^+ z): {gain_used:.4f}\")\n", + " print(f\"Total cost (∑ τ_c^+ z): {cost_used:.4f}\")\n", + " print(f\"gain / cost ratio: {ratio:.4f}\")\n", + "\n", + " return {\n", + " \"lambda\": lam,\n", + " \"selected_ratio\": sel_ratio,\n", + " \"gain_used\": gain_used,\n", + " \"cost_used\": cost_used,\n", + " \"gain_per_cost\": ratio,\n", + " }\n" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "id": "6947da6a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "== Selection summary (Train) ==\n", + "λ = 0.784589\n", + "Selection ratio: 0.297 (11420 / 38400)\n", + "Total gain (∑ τ_r^+ z): 12483.2661\n", + "Total cost (∑ τ_c^+ z): 11228.2803\n", + "gain / cost ratio: 1.1118\n", + "\n", + "== Selection summary (Val) ==\n", + "λ = 0.784589\n", + "Selection ratio: 0.293 (3753 / 12800)\n", + "Total gain (∑ τ_r^+ z): 4124.4445\n", + "Total cost (∑ τ_c^+ z): 3685.4466\n", + "gain / cost ratio: 1.1191\n", + "\n", + "== Selection summary (Test) ==\n", + "λ = 0.784589\n", + "Selection ratio: 0.302 (3862 / 12800)\n", + "Total gain (∑ τ_r^+ z): 4210.7541\n", + "Total cost (∑ τ_c^+ z): 3794.4139\n", + "gain / cost ratio: 1.1097\n" + ] + } + ], + "source": [ + "_ = selection_summary(tau_r_train, tau_c_train, lambda_star, name=\"Train\")\n", + "_ = selection_summary(tau_r_val, tau_c_val, lambda_star, name=\"Val\")\n", + "_ = selection_summary(tau_r_test, tau_c_test, lambda_star, name=\"Test\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "6ec8b309", + "metadata": {}, + "source": [ + "### 5. Cost Curve & AUCC\n", + "\n", + "We evaluate the cost-aware uplift model with the Cost Curve and its area (AUCC, Area Under Cost Curve).\n", + "\n", + "On the test set:\n", + "\n", + "1. Sort in descending order of the duality score ${s(x) = \\tau_r(x) - \\lambda^* \\tau_c(x)}$\n", + "2. 
In that sorted order, accumulate\n", + " - ${\\tau_r^+(x) = \\max(\\tau_r(x), 0)}$\n", + " - ${\\tau_c^+(x) = \\max(\\tau_c(x), 0)}$\n", + " as cumulative sums\n", + "3. Normalize the cumulative cost/gain to the ${[0,1]}$ range by dividing each by its final value\n", + "4. Use the curve running from $(0,0)$ to $(1,1)$ as the Cost Curve\n", + "5. Compute the AUCC by numerical integration, with $x$ the normalized cumulative cost:\n", + " $$\n", + " \\text{AUCC} = \\int_0^1 \\text{gain}(x)\\,dx\n", + " $$\n", + "\n", + "For comparison, we also compute the Cost Curve and AUCC of a random ranking.\n", + "\n", + "- AUCC ${\\approx 0.5}$: a policy close to random\n", + "- AUCC ${>} 0.5$: a policy that picks the most efficient customers first\n" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "id": "56c9cc3e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== Test set Cost Curve (τ-based) ==\n", + "max_cost: 12481.778185597159\n", + "max_gain: 7511.766716461641\n", + "Normalized AUCC: 0.6946670819574676\n" + ] + } + ], + "source": [ + "# Duality R-learner effectiveness score (test set)\n", + "s_test = tau_r_test - lambda_star * tau_c_test\n", + "\n", + "# sort in descending order of score\n", + "order = np.argsort(-s_test)\n", + "tau_r_sorted = np.clip(tau_r_test[order], 0.0, None) # positive part of gain only\n", + "tau_c_sorted = np.clip(tau_c_test[order], 0.0, None) # positive part of cost only\n", + "\n", + "# cumulative cost / gain\n", + "cum_cost = np.cumsum(tau_c_sorted)\n", + "cum_gain = np.cumsum(tau_r_sorted)\n", + "\n", + "# include the origin\n", + "cum_cost = np.insert(cum_cost, 0, 0.0)\n", + "cum_gain = np.insert(cum_gain, 0, 0.0)\n", + "\n", + "# normalize\n", + "max_cost = cum_cost[-1]\n", + "max_gain = cum_gain[-1]\n", + "\n", + "x = cum_cost / max_cost\n", + "y = cum_gain / max_gain\n", + "\n", + "# compute AUCC\n", + "aucc = np.trapz(y, x)\n", + "\n", + "print(\"== Test set Cost Curve (τ-based) ==\")\n", + "print(\"max_cost:\", max_cost)\n", + "print(\"max_gain:\", max_gain)\n", + "print(\"Normalized AUCC:\", aucc)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "id": "3b6badd9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Random 
ranking AUCC: 0.5006872435398204\n" + ] + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAk4AAAHqCAYAAADyPMGQAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAp1BJREFUeJzs3Qd4U9X7B/Bv96RltKVQoKXssilQ9iwCAoKoICBLRRyoP5AhigwXToYKbnCBiAg4GLKRUfbee7eU7r2S+3/eg+m/LS220DRN8v34RDJubk7OTXPfnPEeG03TNBARERHRf7L9702IiIiIiIETERERURGwxYmIiIiokBg4ERERERUSAyciIiKiQmLgRERERFRIDJyIiIiIComBExEREVEhMXAiIiIiKiQGTkREVuDBBx/EqFGjim1/06dPh42NDaKiomBOAgICMGLEiOzba9euhbu7O27dumXScpH5YOBEVun8+fMYPXo0AgMD4ezsDA8PD7Rt2xZz585Fampqsb9eSkqKOtFs2bKl2PdNwKVLl9RJvDAX2fZ+3bhxQx3PQ4cOlXj179y5U712XFxcoZ+zY8cOrFu3DpMmTTJq2cxRjx49ULNmTcycOdPURSEzYW/qAhCVtFWrVuGxxx6Dk5MThg0bhgYNGiAjIwPbt2/HhAkTcPz4cXz11VfFHjjNmDFDXe/UqVOx7psAb29v/Pjjj7mq4uOPP8a1a9cwe/bsO7YtjsBJjqe0XjRp0qTEAyd5bWk1KVu2bKGe8+GHH6Jr164qQKA7yY+o8ePHq3otU6YMq4juioETWZWLFy/i8ccfh7+/PzZt2oRKlSplP/bCCy/g3LlzKrCyBhLMubq6whK4ubnhiSeeyHXfkiVLEBsbe8f91iYyMlJ9pr/44gtTF6XUeuSRR/Diiy/i119/xZNPPmnq4lApx646sioffPABkpKS8O233+YKmgzkF/nLL7+cfTsrKwtvvfUWatSooVqopIXhtddeQ3p6eq7n7du3D927d4eXlxdcXFxQvXr17C9g6RoytHLIL1pDl5F0t9yNdMWMHTtWvaa8dpUqVVQLmWFMyXfffZdv15N0B8r9ObsFpZVLWtb279+PDh06qIBJ3kfv3r1Vd2V+WrdujebNm+e676effkJwcLB6j+XLl1dB6NWrV1EYBw8eRM+ePVW3qIwpkRaQXbt25drG8J6ka2ncuHGq3iQoevjhh4tlDIoct2nTpqnjLHVatWpVTJw48Y7juX79erRr10616EhZ69Spo+pLSL22aNFCXR85cmT28ZSyFyQxMRH/+9//so+lj48PunXrhgMHDuTabvfu3arryNPTUx2jjh07qrowkM+MtIoK+YwVpvtRgib5HIeGhhapa7Ow3cryeRwwYIA6rhUqVFB/P2lpabm2WbhwIbp06aLet7z/oKAgfP7553fs625/RwZ6vR5z5sxB/fr1VTd7xYoVVYuRBMk5aZqGt99+W/3dSF127txZtSbnR8rVqFEj/P7774V6z2Td2OJEVuXPP/9UgUKbNm0Ktf3TTz+N77//Ho8++iheeeUVdWKTsRAnT57EihUrsn/RP/DAA+ok/+qrr6qTrZyYli9frh6X++Uk8dxzz6kAoH///up++aIuiAR37du3V68jJ45mzZqpE9Qff/yhup/kxFJU0dHRKnCRYEdaYeSEI0GQBGN79+7NDgbE5cuXVVAjXTwG77zzDt544w11kpR6kUDm008/VYGYBEV36zaSE5a8Hzm5SqDi4OCAL7/8UgV0W7duRUhISK7t5dd/uXLlVJAjdSknyjFjxuCXX37BvZIT7kMPPaS6ZJ955hnUq1cPR48eVV15Z86cwcqVK7PLKgGlHJ8333xTneilJdIQwMjz5P6pU6eq/cj7Enf7TD377LNYtmyZeg8SNMixkHL
I8ZVjK6QFVI6PHBN537a2ttkBx7Zt29CyZUv12ZGy/vzzz6rchs/B3bofpWtPAhppZS2oa1OOrXzmco7zkfdZGPJ5kIBQniufmU8++UQFMT/88EP2NvL5l0BH6t/e3l79HT7//PPqmEhLb2H+jgwkSJIgVYLWl156SbUif/bZZ+ozKMdIPltCjo8ETjIoXi4SpMr+pVs+P1Lvhs8A0V1pRFYiPj5ek4983759C7X9oUOH1PZPP/10rvvHjx+v7t+0aZO6vWLFCnV77969Be7r1q1baptp06YV6rWnTp2qtl++fPkdj+n1evXvwoUL1TYXL17M9fjmzZvV/fKvQceOHdV9X3zxxR114uTkpL3yyiu57v/ggw80Gxsb7fLly+r2pUuXNDs7O+2dd97Jtd3Ro0c1e3v7O+7Pq1+/fpqjo6N2/vz57Ptu3LihlSlTRuvQoUP2fYb3FBoamv0+xdixY9Xrx8XFaYXVq1cvzd/fP/v2jz/+qNna2mrbtm3LtZ3Uibzmjh071O3Zs2er23LMCiLHWraR8haGp6en9sILLxT4uLzXWrVqad27d8/1vlNSUrTq1atr3bp1y77vww8/zPe4F6Rdu3ZacHDwXbeRz0f9+vW1opDPspTjoYceynX/888/r+4/fPhwrveRl7zXwMDA7NuF+TuSYyfbLFq0KNf9a9euzXV/ZGSk+rzJZyBnfb722mtqu+HDh9+x73fffVc9dvPmzULXAVkndtWR1UhISFD/Fnbw5+rVq9W/0mWUk7Q8CcNYKENLy19//YXMzMxiKetvv/2Gxo0bqxaqvKQb5V5Iy4n8Ss9JWoCklWPp0qWqa8NAWnZatWqFatWqqdvyq19aB6R1QVq+DBdfX1/UqlULmzdvLvB1dTqdmtHVr1+/XN2C0lU6ePBg1fJiODYG0pKT831Kq47sR1rC7pWMX5FWlLp16+Z6D9KiIwzvwXA8pdtG3nNxkH1Ka6UMKs+PzM47e/asqg9pjTKULTk5WXVp/vPPP/dcFtmftN4Zi6HFKGdrYc6/HyHdbgbx8fHqvUk35IULF9Ttwv4dyTGUbkzp5sx5DKW1SLpUDcdww4YNqmVJypLzcyTdpQUx1JG5pVegksfAiayGBAmG8SaFISdp6S7JOxNJggX5kjecxOUEIINLZfySdJ307dtXdbHkHTdT1HQJMiapOPn5+cHR0fGO+wcOHKjGKYWFhWW/toyFkvsN5KQugZUESdKVkvMi3U3SzVIQ6dKTgegyTigvCWQkIMg7TsoQsOU9qeUdx1IU8h6kGy5v+WvXrq0eN7wHed+SmkK6I6U7U7o2JbC8nyBKxtYdO3ZMjamSLjcZqyRBQ86yieHDh99Rvm+++UZ9lgwBxr3IGRQXhQSrERERuS55u7rkM5GTjAeUv5uc466kC03GWMl4NfnbkfdlGDNmeF+F+TuSepLtZUxS3nqSrkbDMTT8beYtm2xXUBBpqKN7/WFC1oNjnMiqAqfKlSurE1hR/NcXqTwu41dkfIeM3fj777/VuCSZDi/3yS9hYyioXHKyy0/OX/059enTRw2eleBAxunIv3Lik5QNBhI0yOutWbMGdnZ2d+yjuN9jfq9xPwGA4T00bNgQs2bNyvdxCWoM9SQtPNJ6Ia2KkiBRWuCkZUpazgoq291IS520msm4ONmHjB17//33VUuetPgZgjK5v6D0BvdaxzK+6V4DTgloZYB2TlIvd0upkfdzKYG4tJpJS5/UvdSzBPDSIiXjtAzvvTB/R7KtBE2LFi3K97XvJ9WEoY7uZfwgWRcGTmRVZNCv5GiS1hWZNXY3MphWvqjlV27OgbI3b95UM95yDrYV0rUlFxlou3jxYgwZMkRNiZeWi6L+ipVf7f8V4Bl+OedNhFjU7ixpBZB6kW4QObFJkCAneQkyc5ZHghY5iRpaaApLTmYSmJ0+ffqOx06dOqWCNEPQYkzyHg4fPqxO4v91PKRMsp1cpE7effddvP766ypokJaTe2m
VkK5JGRAtF2kZkUHh8lmRwEnKZgjuc85+y09RX1sCFun6vRsJBvNrUZPWVZlhmJN0Ieckfx85gysZSC/7kgHjQoIgaTWSiQ05WxIL6t6929+R1JN0w0mLYEE/BIThb1PKlrN7WFo/CwoiZZC5BE3FkeeLLBu76siqyIwuCRTkS1gCoLzk17FkDxcyE0fIjK6cDC0WvXr1Uv/KF3HelhBDq4Ghm8GQL6mw2Z6ly0JO8oaZezkZXstwspXWkZytTfeSvFO6p2T8jXQLyevm7KYTMptLTq7SjZL3vcptGUdTEHmezGaSMUM5u2+k/uXEKNP+Dd2oxiStPtevX8fXX399x2OSLV7GE4mYmJg7Hs97POUzVNjjKcckbzebtJpIYGrYn4zRkeP50UcfqS6nvHKmYijKawv5gSCf0Zxdg3lJefIb2yPT/SWQy3nJ29U1b968XLdlpqWQgFAYWuhyfm6kPqQbLqfC/B3JMZT6lBQheUnKBUOdSDlldp2UJec+8/4t5yTd0//1Y4pIsMWJrIqcnORkLYGBtCLlzBwu07al1cWwjpX8spYxJxKIyBeyjMHYs2ePSk8gA50lL4yQ2/Pnz1cDuWX/MoZKTs4SDBiCL/l1LNPQpTVHWmwkB5K8bkHjmCRXj3RbSHeZdFfIiVVO6PKrXRIZStlkerf8Mp88ebJ6TPYpv8zlBFJUUk4ZNC/Zk+VEJ4Fb3nqTqd3yWhL8yPuX7eVXugR3MphbnlsQea4hN5K0uMiUdElHICdEGf9TEoYOHaq6ISU1gLR2SKuFnISl1Uvul64hyVslqQYkGJXAWFoupHVIjq/kA5LyG+pDxurIsZB6kGBGUirk7dYS8nmQ50pKCzlu0uUkrSaSAkK6oQwtXBK0SrAhx1UG8cuYNAn0pKzyWZKWGyGfBSEtYDL+SgIE6W41BFR5yfuQ+pbXlOOUHxlsLZ8dGUwtwYMMUi8s+QxImgHJPyUtuZLrS55vaJmSoFm65qSMkkpAAkP5+5BgLTw8PHs/hfk7kr9B2YekPpAB9bJvef/SsiR/u/KjR+pZWo3k8yjbSWuqPF/SFUhXc35dcXKMjxw5csdAd6J8mXpaH5EpnDlzRhs1apQWEBCgpi3LtPi2bdtqn376qZaWlpa9XWZmpjZjxgw1JdzBwUGrWrWqNnny5FzbHDhwQBs0aJBWrVo1NbXfx8dH6927t7Zv375cr7lz5041LVxerzCpCaKjo7UxY8Zofn5+6jlVqlRR06ijoqKyt5Hp/TJ1X163YsWKarr1+vXr801H8F/TzYcMGZKdCqAgv/32m5re7ubmpi5169ZV0+xPnz59130b6kmmoLu7u2uurq5a586dVZ3kZEhHkHdKen4pFoqajkBkZGRo77//vqoLqbNy5cqpYyLHWFIziI0bN6qUFZUrV1b1Lv/K8ZXPTE6///67FhQUpNIx3C01QXp6ujZhwgStcePG6nMm9SbX58+ff8e2Bw8e1Pr3769VqFBBlU/KP2DAAFWmnN566y31uZD0CoVJTSApA7p27Vrg41lZWaqMvr6+mrOzs1aUdAQnTpzQHn30UfXepD7lM5uamppr2z/++ENr1KiR2rf8zckxWLBgQa6yF/bvSHz11VfquLm4uKjXbdiwoTZx4kSV4sJAp9Op41qpUiW1XadOnbRjx46pOs2bjuDzzz9Xn8mEhIRCvXeybjbyv/xDKiIisgSSQFMGdEvrWt6ZZgQ0bdpU1U/edQ2J8sPAiYjICkg3oHQZ5jfGy5rJrEnp3pMxYNJ9SPRfGDgRERERFRJn1REREREVEgMnIiIiokJi4ERERERUSAyciIiIiArJ6hJgylIAkiFZktZxMUciIiLSNE0lXZWM/pKQ9m6sLnCSoKkk1sUiIiIi8yILW0vajruxusBJWpoMlWOM9bGkRUvWlZKU//8VtRLr3dzx8866tzb8zFtmvSckJKhGFUOMcDdWFzgZuuckaDJW4JSWlqb2zcCp5LDeTYP1bjq
se9a7NdGX0Lm1MEN42CRCREREVEgMnIiIiIgKiYETERERUSFZ3RinwtLpdMjMzLynflh5nvTFcoxTyWG9mwbrPTdHR0f+3RNZOAZO+eRyiIiIQFxc3D0/X04mkg+CeaJKDuvdNFjvucmPperVq6sAiogsEwOnPAxBk4+PD1xdXYsc/MiJJCsrC/b29gycShDr3TRY73cm1w0PD0e1atX4909koRg45emeMwRNFSpUuKcK5YnENFjvrPfSQHLMSPAkP54cHBxMXRwiMgIODs/BMKZJWpqIiIrK0EUnP8KIyDIxcMoHxyYR0b3gdweR5WPgRERERFRIDJzonowYMQL9+vXLvt2pUyf873//KzW1WdrKU5xOnz4NX19fNXOTSk5GRgYCAgKwb98+VjuRFTNp4PTPP/+gT58+qFy5smriXrly5X8+Z8uWLWjWrBmcnJxQs2ZNfPfddyVSVnMIZKQO5SKDUitWrIhu3bphwYIFaraPsS1fvhxvvfVW9m05wcyZM6fY35dM9Z44caLKk2WtJk+ejBdffDHfxSjr1q2r/jZkdmheBR2T6dOno0mTJrnuk+fLawQGBqr9yeKX8re6cePGXNsdPHgQjz/+uArknJ2dUatWLYwaNQpnzpy55/f366+/qvch+2vYsCFWr179n89JT0/H66+/Dn9/f1Veea/y2c85fvHNN99EjRo11H4bN26MtWvX3lEPhs+a4SLlyDl+afz48Zg0adI9vzciMn8mDZySk5PVF9i8efMKtf3FixfRq1cvdO7cGYcOHVItCk8//TT+/vtvo5fVHPTo0UNNhb506RLWrFmj6unll19G79691SwfYypfvnyhVpW+n/d14cIFzJ49G19++SWmTZuG0tACYSwFJV+9cuUK/vrrLxVQ5rV9+3akpqbi0Ucfxffff3/Pry2fn+DgYGzatAkffvghjh49qoIM+Ty98MIL2dtJOVq3bq2Clp9++gknT55U/3p6euKNN964p9feuXMnBg0ahKeeekoFZdKqKZdjx47d9XkDBgxQQd23336rWuR+/vln1KlTJ/vxKVOmqM/Np59+ihMnTuDZZ5/Fww8/rF4jp/r166vPmuEidZrTkCFD1H3Hjx+/p/dHRBZAKyWkKCtWrLjrNhMnTtTq16+f676BAwdq3bt3L/TrxMfHq9eSf/NKTU3VTpw4of69V3q9XsvIyFD/lqThw4drffv2veP+jRs3qvf79ddfq9sXL15Utw8ePJi9TWxsrLpv8+bN6nZWVpb25JNPagEBAZqzs7NWu3Ztbc6cOXd9vY4dO2ovv/xy9nXZX85LUlKSVqZMGe3XX3/NtR855q6urlpCQkKh31f//v21pk2b3rXec5ZHpKWlaa+88opWuXJl9XotW7bMfr8iKipKe/zxx9XjLi4uWoMGDbTFixfneg3Z5wsvvKD2W6FCBa1Tp05qH/L+NmzYoAUHB6vntm7dWjt16lSu565cuVKV2cnJSatevbo2ffp0LTMzM/tx2cf8+fO1Pn36qPJNmzYt3/r48MMPtebNm+f72IgRI7RXX31VW7NmjTpmefn7+2uzZ8++4355rcaNG2ff7tmzp+bn56eOWV7yWRHJycmal5eX1q9fv3w/74btimrAgAFar169ct0XEhKijR49usDnyPv19PTUoqOjC9ymUqVK2meffXbH52jIkCEF1kNBOnfurE2ZMiXfx4rjO6SwdDqdFh4erv6lksN6t8x6v1tskJdZ5XEKCwtDaGhorvu6d+9u1LEsck5LzdQVPZ+Q/v5n2Lg42N33Prp06aJa9aQrTVrnCkO69qpUqaK6TCSflbQCPPPMM6hUqZL6Zf9f5LXkNeU50m0j3NzcVJfOwoULVYuIgeF2YVurpOVByiNdMkUxZswY1dKwZMkS1TW8YsUK1ZIlrSnSvSRdf9LKIt0wHh4eWLVqFYYOHaq6dlq2bJm9H2nJee6557Bjxw51W1olhHQTffzxxyqPj7RmPPnkk9n
bbNu2DcOGDcMnn3yC9u3b4/z586puRM6WM+kqeu+991R3miRQzY/sq3nz5nfcL+Od5Hjt3r1bdS/Fx8erbeX1iiImJka1Lr3zzjvqmOVVtmxZ9a+08kZFRWHChAn57sewnXB3d7/raz7xxBP44osvsv/Gx40bd8ff+N268f/44w9VJx988AF+/PFHVe6HHnpIdR27uLiobaRVTLrocpLH8rYonT17Vn0+ZFtpTZs5c6ZKZpmTfB6kbomsVZZOj/QsvTo3pmbokKnTI1On/fuvHnpZQUOT8+Htc6KKSPJel/80qG3lPuS6L/e28khWViai4+PQ1sYVNSt6mPT9m1XgJOMuZOxOTnI7ISFBdVEYviRzki9MuRjItobgIO/YH7mtDta/F5GSkYX609bBFI7PeACujkU7RIZy5yQn0iNHjuR6X3mv57xPTtpyEjeQ8SISrCxduhSPPfZYga9neH65cuVgZ2enTpiG4yX3S/dL27ZtVYJACcIiIyPV+JX169fnW+6cXUKyLwlI5VjKshbS5ZL3OQW9H+nekgDt8uXL6qQoXnnlFRUgyDiYd999V90v9+UMtCQ4+OWXX9CiRYvs+yXIev/997Nvy3sRb7/9Njp06KCuS/Al3aPymZQT8IwZM9R9EjwJGacl423kvqlTp2bvS7qocnbB5Vcn8h4kwMv7mHRNSdmCgoLU7YEDB6puq3bt2t1RR3erNwkc5F/p5rrbMTGMYZLPVt56zytvd1heEqganit/45KANue+5LbcX9D+pQtXAiCpawnaJaCTLsXo6OjscU4SfM2aNUsFkhIMS7eebCv5lgz7lYBIPify3iUglmMk20twnTOwl8+uHIf8ymOo3/y+X4qb4fuqJMYwkmXWuwQ5txLTcSspHVGJ6YhKysi+HZeSiaT0LHVJTDNcMpGSoUOWRDYlqLpNOCa6zcUS+9aIyXoeL3s3KvbXKMrxNKvA6V7IL0Y5ceV169atOwYYy7gSqTw5QRvGBBl7bNDdqHIUchSa4Ys6v/IakvHlfV9536NsZ7j++eefq4H3V69eVQGAjOeRViTD43lfz3DCyPn6ecsjg/rlxC4nJxng/cMPP6iWozZt2hRYz7IPmSEngZKMiZNWGwns+vbtq54jJ0wZtGwg4+UGDx6cqzwyHk7eW84xL0KCMAnyZBt5XFp7li1bpoIheb+GVoqc77Fp06a5ymqoW3lfhvul1UnIfqS14vDhw6r1SQK0nM+Tz58E8oaEq3n3nZ+UlBQ1SDnvdhIgSOBluF+ud+3aVQULOU/6+X1GDCeCnJ+JnJ+F/Bjet/zNGK4X1Doqgfd/yVuneT9HebfJWxZ5bfm8yvgqIa1P0sI5d+5c9YPqo48+Ui2B9erVU9vKoPfhw4er5xj2K5MpDOR4SoAqE1CklXLkyJHZj8ngczkO+ZVH7pPyStBm7Mzh8jrSsijHjguKlxxzqfcsnYaIxAzcTMzAreQMRCZmIlICo6Tb/0YmZiAmJet2a899cLK3gYOtLRzsbGAvF1sb2Ml3gQ1ge/sf+XaATc7bMvlCBln/+5WhJmPcvvvff///dvOMMLTXf4Fo+0yMzQzDdTyvfnQXt6LMUjarwElm7ty8eTPXfXJbfrHm19pkmIGUs+lfTlQyQ0hObvK8nOREJpUnJ2ZDV0kZOzvV8lMUmZmy3ML9V21RuurkD1gu+XXxyGBZOVHIY4bMxtIiZNjW8MvZcJ+cKKQ1RE420l0hJ14ZJLxnz57s5+R9PcMspJyvn195pLtw/vz5eO2111TgJC0sdzvByD6ktckwu0mCLpkBJl1m0oIVEhKS3aIhJ3HpYjSsE2gojwR+8t5kGrn8m5PsW7aR9/rZZ5+pwecyk0u6e8aOHZu97qDhPRq2NzDsTz5/hvsN78fw/pOSklQLXv/+/e94f7I/w5evfB4L6qIzkM+tfGnn3E66IKWLbu/evapecwYUEggauktl/4bPd07yNyFda3K/1LO8T2l5ultZDMd
Duh2lm+xux/C/umFlwLWhq07+xqXFKOdry48cub+g8khroZ+fX65lkho0aJC9YLe0xEkr0e+//67+xiWokee8+uqr2X8X+fHy8kLt2rVVi1bObWRZJjkO+T1P7pPjKWXJ2zVojBO4HCspS2k+gVua0lTv8hmXFqLzt5Jx/lZS9r+Xo1NwIz4NukK0DEnA4+XupC7eZRyzr5d1dUAZZ3u4O9qrf29fHODmZKfOTU72dnC0t4WdIfoxgoSoU9j1w2eIt9PQ2rMOHDrNQZeA+kap96L8vZpV4CQn8bxTk6WbR+4viPw6lEtehhN/3vtyTkUW8q+bU+EPkvrlbnv7C9QUWYTzvqbMjJKuBgkC5DHp9hByQjFsKy0ihufKRbrlpBUo5wwqOXnkt/+ct3PWmwRohi+YnGTckARlhtlNhnQDhX1fEqhIcCDBsJxwpbVGToz5rVVnKI+0dEkQISfggsb8yHuWViwpn5CyS3eUtDwU9B5zlivvZybv68u+pJz/9R7/qy6kVUpmr+XcTlqbpJsw7+xUCTLlMcN4KmlxO3DgwB2vIYGnPCb3ywlfurUkuJUZmXnHOUnQIEGWbCOBhQTU0oWb833n3E5Ii9/dSEBneK78LctnVj6vBhs2bFD3F1Q30v0r47ukRdIwnkoCP/l7lh9JOZ8nAa4E1xJkS1edjNkraL8S8EpgKJ+JnNvIjDo5Dvk9z3AM8/t+MYaSfC0yXb1nZOlxJSYZ5yINAdLtIOlCZBIS0wtuGXayt4VfWRf4ejqrSyX1rwt8PQzXnVHBzbFUZry/mngV++JPw7XeQwi1cYNbt7cRGR1ntHovyj5NGjjJF9O5c+dypRuQL1mZ2i5dHNJadP36ddUyIaSpXVoFpJtHBt/KF6x8actAXrrd9SQBkQQJ0hInY3ikq1LG2xjG18iJo1WrVqpbSsbaSJOnTNXOSU7wUucyxke2kQG30poh1wtLumckT5d0l0jgKidZIV1j0vIig4ofeOABdRIrKhlnJc+XQEHy6vwXaTWQIEvqQAZwy0lPgigZ59KoUSOV4kLes7TOSAAlZZQuLqlDw5ih+yHjmOQYyGdaBsLLH6gEqzLQXcZGFYUELNJqJ8dYgkgJAOT4yHgcaWXJSbaT9yEneplmL8GIBI4y8FuOgexDxkbJgGwJlAykXiUYkTE/sl+pIwlK5UeKdOFK4CYB1TfffKOOhUzrlyBL6lBai+RvUsaVSculkO6uwpL9dOzYUR0nOS6yD2kp/Oqrr7K3yfu9IF2zMhBcutOkW94waF2+Iwwt0dIiJ8+R1kr5V1oAJTiW7xID+SxJt690H0s3qwzclzqWbs+cZGB4zpxlRMUpIj4NB6/E4vTNRJy9mYSTEQm4Ep1S4LgiafCpVt4VNbzdUcPHHYFebqju5Qb/Cm7wKeMEWyO2CBmDPu4KjkQexlmkoVqZagju8Qns7RxK15gyzYQMU7nzXmQKupB/ZQp43uc0adJEc3R01AIDA7WFCxcW6TUtOR2Bof7s7e01b29vLTQ0VFuwYMEd0zflPcqUeZk6L3W5bt26XOkIZOq+TG2XKd5ly5bVnnvuOTXNPedU7bulIxBhYWFao0aN1PT7vB8zQ4qEpUuX3nOahZkzZ6r3aJgy/1/pCOSxqVOnqhQLDg4Oanr6ww8/rB05ckQ9LlPZ5XXc3d01Hx8fNd182LBhd32POT/DOaffS6oHuU9SPxisXbtWa9OmjapzDw8PlQ7hq6++KlI6DiEpDCRlguxPLFu2TLO1tdUiIiLy3b5evXra2LFjs2///fffWtu2bbVy5cplp1TYunXrHc+7ceOGSr0gKQzkb03SEzz00EO5UjiIPXv2qJQEcizkWNesWVN75plntLNnz2r3Sj4Xkk5BXlfSj6xatSrX4/l9L5w8eVJ93qV+q1Spoo0bN05LSUnJfnzLli2qLqSM8r6HDh2qXb9+/Y7UJvK5MLxfuX3u3Llc2+zcuVP9TeTcd05MR2D5inNafFx
yhvbPmUjts01ntVHf79VC3tmg+U/6K99L0BtrtD6fbtP+t+Sg9unGM9rqIze00xEJWlpmlmYpUi5s1TbNqa0t+7yJdibi/1PmlLZ0BDbyP1gRGc8hA0hlnEh+Y5yk1UtaVu51fEJ+XUaUm7SQSOuH/Ko3jLm6X9ZU79IiJFPwS0PiV2uqd8NsRZkkkXMsWXF/hxSW/AKXFmPpfmdXXcm513qX7rbjN+Jx8EocDl2Nw5FrcbgUnXLHdtJAVNfXA/Ure6BWRXfU8fVAnYplUNHDyaL/xqIO/4SwDZMBfRZauwfAa8hvgMftWdAl8Xm/W2xg1mOcyLzJTCSZ5i3dhKNHjy62oMnaSN3JGCIZ6G2sbO10J5lpKRMHco6/IiqI5DeSLrfdF2Ow91IMDlyJRVrmnd1N/hVc0ahKWTSu4qn+lYDJzcmKTs16Pc6un4wjR39CBZ0OIf5d4NLvC8DZtLma7saKjg6ZmkwRl/E1MpBZxqnQvZHWHUm4SSVLAv284wGJDOJTM7H/cgz2XIzFnovROHo9XiWFzKmcqwOaVSuHJlXLopFc/DxRzs16f0Bmxl3B/hUjcDXmFGpnZqJhw6GwffBDwDb37OfShoETlRgZkJszsSYRkbmS5Mj7r8Rj94VobD8XpQKlvANfZPZaSGB5tKxeHiHVy6sB3Jbc3VYUCRkJCFv9PFKiT6FVli2qdv8YaHZ7ElNpx8CJiIioEOP5ztxMwtbTkVh//DoOX09CRp4WpYAKripIalm9ggqUqpRzYaCUj2uJ17A3Yi9cmo9E10Pl4NFlGuBV+Nm3psbAiYiIqIBxStvO3sLm05HYcvoWwuNzrzYhOZKkRaltDS+0reml8iJRwfRZ6Ti273OcLueHKmWqoLlvczjUfgTmhoETERHRv6KS0rHpZCTWn7yJ7Wejci3yLgklWwWWR7NKzniwaXXUrFiGLUqFlBZzAbtXjsSt2LNo1Goc6tTJve6pOWHgREREVu1qTAr+Ph6hLvsux+YaqyStSqH1fNC5rg9aBVaAo53Nv9PiOV6psKKOL0fYhklAZhI66hzgXa42zBkDJyIisrrxSpKZe9WRcKw/cROnInIv8CopAboFVURovYrqes4B3aUqg7UZOLfhdRw+tBDl9Xq08qwFl8e+B8oHwpwxcCIiIqtwLjIJfx25gb+OhKvrBrJQbXP/cujZwBfd6vuqVia6P1m6TOxfNwFXTixFrcxMNJJUA93fBRzMfxwYAycqFvKLbMWKFejXr1+J16isdl+vXj3s2LGjSOui0f0nhJR1AGU9uZCQEFYnlUqXo5NVoPTn4Ru5WpYc7WzRobY3Hmzoi851fKw6n1JxS8xIRNjRRUg+/itC0tJRreNrQPtXYCm4pLaFGDFiRPbK7A4ODmrJB1nAVJaAsHSSVPOhhx5SCwvntyiuLNQqixTn1alTJ/zvf/+74/7vvvsOZcuWvSMdvySdrFu3rlpKw9fXF6GhoVi+fLlq9jeQRatlsVlZvFgWN5bjIIvEykK192rLli1o1qyZ2p8EhlK+u7l06VL2ZyHnZdeuXbm2+/XXX7Pfj2TEXr16da7H5b3JQswVKlRQz5cFuPMmhHzllVcKXH6EyJQL5S7ccRH95u1Axw+34MO/T6ugyd7WBp3qeOOjxxpj3xuh+GZ4c/RvVoVBUzG6nnQdG69shL58ALq0n4Jq3T+wqKBJsMXJgvTo0QMLFy5EZmYm9u/fj+HDh6sT3vvvvw9LXsbl22+/xdq1a+947MqVK9i5cyfGjBmDBQsWoEWLFvf0GrK8Sbt27dQaRm+//bbaj2Tv3rp1qwpOu3TpogItCY66du2KBg0a4Msvv1RBiSyL8vvvv6sAQ7YvKln3rFevXnj22WexaNEibNy4EU8//TQqVaqkgsK72bBhA+rXr599WwIgA6kXCehmzpyJ3r17Y/Hixaq18MCBA6r8Ijk5Wb3vAQMGYNSoUfm+xpAhQzB+/HgcP348+3lEppCUnoU1R8O
x8tB17DwfnT3AW9Z+a1PDC70bVUL3+r4MkoxES43HsT+fxanqIfCr3BItfFvAoboDLJJmZe62AnJxrGyu1+u1jIwM9W9JkhXj+/btm+u+/v37a02bNs2+HRUVpT3++ONa5cqV1SryDRo00BYvXpzrObLq/IsvvqhNmDBBK1eunFaxYkVt2rRpubY5c+aM1r59e7XSvKw4v27dOlWnK1asyN7myJEjWufOnTVnZ2etfPny2qhRo7TExMQ7yvvOO+9oPj4+mqenpzZjxgwtMzNTGz9+vHptWaF+wYIFd33fv/76q+bt7Z1vvU+fPl2935MnT6r9513RXt7ryy+/fMc+Fy5cqLY3eO655zQ3Nzft+vXrd2wr70nKLK9bv359LTg4ON/Vu2NjY7V7MXHiRLXfnAYOHKh17969wOdcvHhRHY+DB3OvLp7TgAEDtF69euW6LyQkRBs9enSR9ifvu1OnTtrrr79eyHdk2YrjO6SwjL1avDnIzNJpm07d1F76+YBWd8oazX/SX9mX/vN3aN9su6DdTCjeY8F6v1Na1Dlt67zG2q8fVdJOfd5SKkkrbsau97vFBnmxxamwMpILfszGLveAN9lWby8Df/LZ1hZwcPnv/Tq64X4cO3ZMtSr4+/tn3yfddsHBwZg0aZJa/XnVqlUYOnQoatSogZYtW2Zv9/3332PcuHHYvXs3wsLCVDdg27Zt0a1bNzWjpH///qhYsaJ6XFph8nZ3SUuFtIa0bt1adZHJ1F1pJZGWn5zdTJs2bVJdWv/8848an/TUU0+pMstadrLvX375RS1oK68r2+Vn27Zt6j3lJd1n0vo2b9481fIjXVzLli1T77co5P3KGB5pWalc+f9X6jZwd3dX/x48eFC1ukjLTX4rd+fs+pNWoMuXLxf4mu3bt8eaNWvUdal/6RLMSeo2vy7GvKT7Uo65jEOSljG5bSD7lWOcd78rV65EUTVv3hzbt28v8vOI7oX8bR+/kYBl+6+pcUvRyRnZjwV6uaF/Mz/0beKHquVdWcElIPr4cuzaMBG6zBR0sCsLn97zgHy+Ay0JA6fCevfOk2a2Wg8AQ379/0qdUw82mSn5b+vfDhi56v9vz2kIpETfud30eBTVX3/9pU7kWVlZSE9PVyfwzz77LPtxPz8/1a1i8OKLL+Lvv//G0qVLcwVOjRo1wrRp026/tVq11D6ki0gCGOn+OXXqlHqeIZB499130bNnz+znS/AgJ+wffvgBbm63A0DZR58+fVS3oQRdonz58vjkk09UOevUqaMWAZauN8OYGVkI+L333lMn5ccffzzf9ywBSH4BjZRT9mXoznriiSdUl15RA6eoqCjExsaq4Otuzp49q/79r+2EjCWS7tSCuLj8f2AdERGRXV8GclvGXKWmpuba1kA+Ax9//LEKdqVuf/vtN9UNJ0GRIXgqaL9yf1FJ/d8tECQqDjHJGVhx8Dp+3Xc11yDvCm6O6NO4Mvo2qawWz+VacCVEr8f5DZNx6MiPKKvXo7VnLbgO+Ako9/8/1i0VAycL0rlzZ3z++eeqxWf27NlqHM4jj/x/OnudTqeCHAmUrl+/rmZFSYDl6pr7l5kETjnJeBppNRInT55E1apVcwUr0rKUk2zTuHHj7KBJyElcWm9Onz6dfcKWlpecrTNyf85xMjKoW8blGF47PxI8yODmvGRM08CBA1UdCBnPM2HCBJw/f161sBVWzoHfxbGdyNkKaAxeXl65WpNkTNaNGzfw4Ycf5mp1Ki4SvEmQSlTc5O9q98UY/LznCtYcjUCG7nYOJUd7WzwQVBGPNKuCdrW84GBn2S0cpU1WehIOLhuMSxH7UCMzC42DBsCu18eAvROsAQOnwnrtxt276nLI+t9JdcK2KairLqf/HUVxkUDFMB1fAgcJXqSVRbrAhJw4586dizlz5qhZVLK9dPlIAJWTzMrLVWQbG6MkfcvvdYr62hIkSItQTjExMSo1grTqSCCZM3CUepFZeEK6K6WrMb/
B4J6enuq6t7e36maTVra7ke4wIds1bdr0rtsWpatOZu/dvHkz1+NyW8qeX2tTQSRdwPr167NvF7Rfub+opL6lnoiKS2xyBn47cA2L91zBhVv/P5yhgZ8HBjavqlqYyroyfYApJGUkISx8J5KQiZbpOvj3nA00HQJrwsCpsIoy5ki2lZaO/AKn+9lvEUhLjnR5ScvD4MGD1UlWxhH17dtXdVsJCUjOnDmDoKCgQu9X8iVdvXoV4eHhqiVK5J3mLtvIWCZp+TK0OslrG7rkipMEKT/99FOu+2T2mYyJyjteZ926daoL680331StWVIWuS8vmVlmCISkzNJN+OOPP6ruy7zdgklJSarFq0mTJqoeZf/S0pV3nJMEY4ZxTkXpqpPWvLxpAiQAytvK918klYDheBn2K92vOcdK3ct+hYzt+q9gkagwrUt7LsaoYCln65Kro53qhhvc0h8Nq9z+QUOmEZ5wFbsj98PJzgldHpwHz+RYoMqdY0wtHds3Ldhjjz2mAgQZIG0YryQnRxmALd1pMvA6b6vDf5GByhJUSKqDw4cPq8HZkt8oJxlILcGEbCOD1Ddv3qzGU8n4orzjau6XjGGSE3fOVidpZXv00UdVt1/Oi7S8yZglQ+qC5557TgWOL730Eo4cOaK6EWfNmoWff/5ZpQ8wkBYq6Z6UVhsZt3XixAk1pklaryRgkOBJWsZkMLrsT1qMJNi5cOGC2q88XwLWnF110jJY0EXGohlIGgLZjwzultas+fPnq67WsWPHZm8j48ckDULOwf3yHmR7uUj3rJRVjoHByy+/rOpBAj3ZZvr06Sqdggzgz9mSJAGXvF8h9SO3846DkqBYxr8R3WsagR93XUaPOdsw8Ktd+P3QDRU0yVIn7zzcAHteD8XM/o0YNJmQlpmG48tHYPuq5+Dt7IXQaqHwLBdolUGTolkZa0pHIGbOnKmm6yclJWnR0dFqG3d3d5UCYMqUKdqwYcNyPS+/KfryuOzf4PTp01q7du00R0dHrXbt2tratWvvOR1BTvm9tr+/vzZ79uy7vveWLVtqn3/+uar3vXv3qrLs2bMn32179uypPfzww9m3Zbtu3bqpOpIUBDIlP+f7MIiLi9NeffVVrVatWup9S5qG0NBQtW3OYy11I3UqKR9kOyn/oEGDtAMHDmj3avPmzVqTJk3U/gIDA1W6hJwkXYS8jsF3332n0kS4urpqHh4eqn4kbUNeS5cuVcdP9ispD1atWpXrcXkdqcu8l5zpKXbs2KGVLVtWS05Ovuf3Z0mYjqDwzt5M1N5YeVSrP3VtdgoBSSkwadlh7fDVe0vfURKsLR1BWsxFbevXbbWlH1XSTrzjpemv5P/damylKR2BjfwPVkRmI8n4FRnbIuNEcpKZYJJwULI95zfguDCkOmVWW4FjnKjYSVoFGfgtKQEkmzXrveRIt6S05k2ZMoX1XkzfIYUlXe0yccLHxyffFBilkXw/bj8XhW+3X8SW07dypRF4opU/HgmuAk+X0p000Rzr/V7Fnl2HnWtehC4jES31DvB9+FugVu70KJZS73eLDfLiGCcye5JZW7rIZKagnLCoZMikAgmapNuP6G7SMnX4/dB1LNh+Cadv3k4lIL8ru9atiBFtAtC25u1lfaiU0DRc2PERDu35BJ5ZmWjtXh2ujy8CygeaumSlAgMnsggyyFla+qjkSOuetDSx3qkgkYlp+CnsMhbtvpKdqFIGew9oXlUFTAFexpkcQ/dOp9fh4KoXcPHMHwjMyEKTGg/Crt88o01kMkcMnIiIqFgdvxGvWpcks7dhdpxfWRcVLA1oUbXUd8dZq+TMZITdCEOCdyCaH9NQvf1koO3/CjdD3IowcCIiovum12vYeCoS326/gF0XYrLvb1atLJ5qF4ju9SvCnokqSydNQ8SNPdidGgEHWwd0bjIK5YKGAh7/n8KE/h8DJyIiumfJ6Vlq3biFOy7iUvTtDPJ2tjZ4sGElPNk2AE2rlWPtlmJaWiJ
O/vksjl/fAd9uMxFS9xE42jkCxp3bYNYYOBERUZFdj0vF9zsvqeVQEtNujy/0cLbHoJBqGN46AJXLFj6zPZlGRvhh7Fk5EuEpEQjKyEJQWiZsJGiiu2LgREREhXbsejy+3nYBfx0Jh05/O5tNdS83jGwboNaOc3PiacUcxO37FmFb30SGlon2dh7wHbwA8G9j6mKZBX7CiYjoPx24Eou5G85i65n/z7/UpkYFPNWuOjrXkdw6HEBsFjQNlza+gQOHFsBDr0cH31Zwe+QbwM3L1CUzGwyciIio0AGTjF/q1bASnukQiAZ+XDvO3FINHN75Mc4fXoCArCw0C34edl2nyqKcpi6aWWHgRCYna53JmmyyZpoks8y7OC8RlY6A6ZFmfhjTuRaqVXDlITEzKZkpCAsPQ5xXAIK9myEw6BGg+UhTF8ssMXAikxs3bhyaNGmCNWvWwN3d3dTFIbJqey7G4NNNZ7HtbJS6zYDJzOl1uLlzLnZ7VYGdoxs6B4SifJ3HmJvpPjBwIqOtSaXT6dSaff/l/PnzqsWpSpUq97X8h6WvG0VkTGHnozF345nsHEz2tjbozxYms6bFXcOp5cNxPPYkfGr3RsiDn8LJzsnUxTJ7PNNYgB9++AEVKlRAenp6rvv79euHoUOHFnl/0mXm6uqKxYsXZ9+3dOlSuLi44MSJE/k+Z8uWLWqtKWk1Cg4OhpOTE7Zv364WZpw5c6ZaQ06e37hxYyxbtkw959KlS+o50dHRePLJJ9X17777Tj127Ngx9OzZU7VAVaxYUb2PqKjbv4BFp06dMGbMGLXUipeXF3r06FHo57300kuYOHEiypcvD19fX0yfPj3Xe4mLi8Po0aPV82WhVlmP7a+//sp+XN5X+/bt1fupWrWq2l9ycnKR65moNNh/OQaDvtqFQV/vUkGTg50NBodUw+bxnfDBo43ZLWemMs/8jZ3fd8Wx2JOop7dHe7/2DJqKi2Zl4uPjZf6s+jev1NRU7cSJE+rfe6XX67WMjAz1b0lJSUnRPD09taVLl2bfd/PmTc3e3l7btGmTuv3PP/9obm5ud7389NNP2c+fN2+e2ufly5e1q1evauXKldPmzp1bYBk2b96s6rVRo0baunXrtHPnzmnR0dHa22+/rdWtW1dbu3atdv78eW3hwoWak5OTtmXLFi0rK0sLDw/XPDw8tDlz5qjr8l5iY2M1b29vbfLkydrJkye1AwcOaN26ddM6d+6c/XodO3bU3N3dtQkTJminTp1S20VGRhbqefJ606dP186cOaN9//33mo2NjSqz0Ol0WqtWrbT69eur+6TMf/75p7Z69Wr1uLwvqavZs2er5+/YsUNr2rSpNmLECM0ameLzXpoVx3dIYclnVf5m5N97ceRqnDZ8wW7Nf9Jf6lLrtdXalBVHtWuxKcVeVktyv/VeEuIO/aSt/rCytuLDStqN+S01Lfq8Zu50Rq73u8UGednI/2BFEhIS4Onpifj4eHh4eOR6LC0tDRcvXlStI9LSYJClz0Jixu0Vvf+LVKcseipdVPe72ncZxzKwty1cb+rzzz+vWnBWr16tbs+aNQvz5s3DuXPnVDlSU1Nx/fr1u+5DWljKlCmTfbt3796qvmQxVzs7O6xdu7bA9yQtTp07d1YDu/v27avukxYwadXZsGEDWrdunb3t008/jZSUlOwWrbJly2LOnDkYMWKEuv32229j27Zt+Pvvv7Ofc+3aNdW6c/r0adSuXVu1HEnZDhw4kF3vb775Jnbu3Pmfz5MuRNm/QcuWLdGlSxe89957WLdunWqxOnnypNo+Lym71MWXX36ZqwWqY8eOqtUp5+fGGhTn590SFPQdYgzSmhsZGQkfH0kFUPjOg9MRiZi9/gzWHo/IHsM0oHkVjOlSS60nR8ap9xKh1+HK+snYf/QnuOv1aF29O9z7fQk4mP/3kt7I9X632CAvjnEqBAmaNlzZULja125P+bSztQPu8zwSWi0U5ZwLt1zBqFGj0KJ
FCxUc+fn5qS4vCUQMJzPpVqpZs2aRXn/BggUqeJAP6fHjxwt1YmzevHn2dQnaJEDq1q3bHeORmjZtWuA+Dh8+jM2bN+c7UFzGQxkCGukSzOnIkSOFel6jRo1yPVapUiX1BykOHTqkxlrlFzQZyiavs2jRolzBg/xRywmzXr16Bb4vIlO6GJWMORvO4I/DNySVj1q3tV8TP7zctRYCvLjyvbnTa3ocvrQR506vUKkGmjYaDvse7wFyLqJixcCpkC0/EsSYosWpsCQQkfFDMt7pgQceUIHOqlWrsh+XFhZpSbkbaUUZMmRIriBBWlEkcAoPD1cBxn9xc/v/L+CkpCT1r5RDgrmcZAxUQeR5ffr0wfvvv3/HYznLkPO1hJS1MM9zcMi9MrscJwl8DAHm3UjZZPyTjGvKq1q1and9LpEpXItNwacbz2HZgWvZmb4fbOiLsaG1Uati4b9jqHSnGtgVvguxWYlo1nkGamgOQKPHTF0si8XAqTCVZGtf6JYfU3ZdSDeSdHlJq1NoaKjqosrZEiStKf/VVWcQExOjWqxef/11FTRJQCXdYv8VWOQUFBSkAqQrV66orqzCatasGX777TcEBAQUalaegaQ0kK7Coj4vJ2mNku69M2fO5NvqJGWTAfJFbb0jKmnRSen4dNM5LNp9GZm62wFTl7o+GNetNhNXWgpNQ+TWd7E7Ixo2NbugU9VOqOBSwdSlsngMnCzI4MGDMX78eHz99deq5SmnonbVSXoACbymTJmixipJi5bsW8ZNFZaMl5LnjB07VrXotGvXTvUfS8JL6UMePnx4vs974YUX1HsYNGhQ9uw36fZbsmQJvvnmGzXGKD/PPfec6l4s6vNykgCvQ4cOeOSRR9Q4MakzmWUoQbDM3Js0aRJatWqlZvRJoCqtXhJIrV+/Hp999lmh64bIWFIzdPh2+wV8sfUCktJvL77btmYFjOtWB8H+hfsBSGYgNRanV47C0Rs74e3giVYdZ8DJiS2IJYGBkwWRgW1ywpeuMUlFcK8k6JJB5gcPHlQtN3L56aefVOAjA8b/q8svp7feegve3t4qJcGFCxfUQHBptXnttdcKfE7lypVVcCVBinQ7SuDm7++vApe7DQqU58lA7VdffbVIz8tLWrsk4JMATLr/JHiSgeOGFqmtW7eqljhJSSAtjDVq1MDAgQMLvX8iY5BuuGX7r+LjdWcQmXg7NUkDPw+82qMe2tXiOmSWJPPCVuxdPQbX06NRN1ND/Q7jYevI5MElhbPqinlGjKlnGXXt2hX169fHJ598Amti6nq3Vqz30jGrbvfFWLz51wmcDE9Qj1Ut74LxD9RBn0aVufiuJc2q0+sRv+0DhO2ZizQbG7RwrAC/hxcAVVvA0uk5q46KW2xsrEoJIJf58+ezgomswLW4dExddwDrTtxUt8s426tZckNb+8PJnrOpLIouC1eXPIZ94bvhpmnoGtgTZXrNAZzvPnWeih+76iyEjEGS4ElmlNWpU8fUxSEiI4pPyVTryX238xKy9JrKxTQkpBr+F1ob5d0cWfcWmGrgSPQxnHV1RTXNFsFd3oF98AiuN2ciDJwshCS/JCLLlpGlx0+7LuOTTWcRl5Kp7utQywtv9A5iagFLpGlITY3BrqgjiEmLQZMOU1Cr7RuAT11Tl8yqMXAiIjKDsWQbT0bindUnVSJLUdvHHc+29kW/kFqlL4M13b+MZET9+SLC4s8A7cejY9WO8HLhIP/SgIETEVEpduZmIt766wS2nb29WLWXuxNeeaA2HmlaGTHR/7+ANVmQm8dxZvkIHEm5Di+dhhC4wYVBU6nBwCkfhizSRERFUZxLf8alZKg15X7afUWlGnC0s8WT7arjhc41UMbZgd9TFirz6DLsW/M/XLPTo7adBxr2/wq21TuYuliUAwOnHGQxW2nyvnHjhso9JLeLOrWd07NNg/XOei8Nn8Fbt26p74y8y/oUhV6vYfGeK/ho3enscUwPBFXE673qwb8C15S
zWJqGhM3vYOf+eUi1tUGr8g1Q9bGfAHcfU5eM8mDglIMETZJ/RZYYkeDpXhgWfJV9MZ9QyWG9mwbrPTf5m5dFoguTpT4/J24k4LUVR3Hoapy6XadiGUztE4S2NTm2xdJd2/A69h7+DrKoVde6A+EhC/TaF7ymJ5kOA6c8pJVJFmuVZIo6na7IFSpBU3R0NCpUqMABmyWI9W4arPfcpKXpXoKm5PQszNlwBgt2XFLdcu5O9moc09BW/rC348BvS081cDTqKM54VUUVp/Jo3uJ5OLR63tTFortg4JQPQ1P7vTS3y4lEnidZgznTpeSw3k2D9X7/1p+4iWm/H8ON+DR1+8GGvpjauz58PY2beZxMLy3qDHanhuNW6i00DuiK2g1GAg487qUdAyciIhO4EZeK6X8cz876XaWcC97q2wCd63JMi8XT6xC1dSbCDnwBtHgaHZu/AG9Xb1OXigqJgRMRUQnK0ulVxm+ZMZecoYO9rQ2ebh+olkpxceQyKRYvKRJnlw7GkZgTKK/Xo1V8DFwYNJkVBk5ERCXk8NU4Nfj7+I3bi/EG+5fDOw83QF1frjdmDbLCj2L/b4NxJT0ateCARqFvw7bpE6YuFhURAyciIiNLSMvER3+fxo+7Lsusc3g422Pyg/UwsHlV2NoWLeUJmafEU38hbM1LSNaloZWTN6oOWgZ41TJ1segeMHAiIjJiuobVRyMw48/jiExMV/c93NRP5WSSDOBkHa5f3Ym9q56Fs16HLl6N4TlgEeBWwdTFonvEwImIyAiuxqTgjd+PYcvpW+p2dS83Nfi7XS3mZLKmVAPHo47jVOp1+NXuhRY6Wzj0nsP8TGaOgRMRUTHK1OnxzbaLmLvxDNIy9WqplGc71cDznWrA2YGDv61FevgR7I4/g0hboJFXI9Sp+YhkWTZ1sagYMHAiIiomx2/EY8KvR3Ai/Pbg71aB5fHOww1Rw9uddWwtNA3R+75C2LZ3ofesgg4DfoVPmcqmLhUVIwZORET3KT1Lh882ncPnW84jS6+hrKsDpvQKwiPN/Lj0kjVJT8T5P57DocubUFavR2v7CnC1l0VUyJIwcCIiug+yrtzEZYdx5mZSdubvGQ81gHcZDv62JllR53Dw14G4lBKOmll6NG4zHrbtxrF7zgIxcCIiugdpmTqVxPLrbReg1wAvd0e82bcBHmxYifVpZZIubsXO359EclYqWtp5wn/QAsC/tamLRUZi8pFq8+bNQ0BAgFrbLSQkBHv27Lnr9nPmzEGdOnXg4uKCqlWrYuzYsUhLu73GExFRSdh3KQY9527Dl//cDpr6NamMdWM7MmiyQuGJN7Bh02ToMlPQxbM2/J/ayKDJwpm0xemXX37BuHHj8MUXX6igSYKi7t274/Tp0/DxuXO9psWLF+PVV1/FggUL0KZNG5w5cwYjRoxQYwhmzZplkvdARNbVyvTxutP4ZvtFlciyoocT3unXEKFBFU1dNDJBji5JNXAi5gQqtxuPlsf/hkOvjwEnTgSwdCYNnCTYGTVqFEaOHKluSwC1atUqFRhJgJTXzp070bZtWwwePFjdlpaqQYMGYffu3SVediKyLkevxWPs0kM4F3l7LNOjwVXwRu8geLo4mLpoVMJ0cRex/czXuOVbDw29GqJOuTqwqdOfx8FKmCxwysjIwP79+zF58uTs+2xtbREaGoqwsLB8nyOtTD/99JPqzmvZsiUuXLiA1atXY+jQoQW+Tnp6uroYJCTcnias1+vVpbjJPuWXiDH2Taz30sYaPu+yKK90yc3deE7NmJOxTDMfboCu9W63MpnqvVtD3ZdG0ceW4aB0zWWlom23D1GxbG11HORC5vt5L8p+TRY4RUVFQafToWLF3E3ccvvUqVP5PkdamuR57dq1UxWYlZWFZ599Fq+99lqBrzNz5kzMmDHjjvtv3bpllLFRUvnx8fGqfBIIUslgvZuGpdf7tbh0zPj7Io6GJ6vbnWuWxatd/eHpYoPIyEiTls3S677
U0TTc2vcxTp/+GeX0OgS71USGa12Tfw6shd7In/fExETLnFW3ZcsWvPvuu5g/f74aE3Xu3Dm8/PLLeOutt/DGG2/k+xxp0ZJxVDlbnGRQube3Nzw8PIxycGXMleyfX2Ylh/VuGpZa7/LlvHT/Nbz910kkZ+jg7mSPGQ8FqUHg8n5LA0ut+9IoKzMVh/8ajUuXNiEwMwu1Ah+Ga99ZsHVkjiZL+bzLBLVSHzh5eXnBzs4ON2/ezHW/3Pb19c33ORIcSbfc008/rW43bNgQycnJeOaZZ/D666/nW5lOTk7qkpdsa6wvGzm4xtw/sd5LE0v7vEclpWPy8qNYf+L2d1PL6uUxa0BjVCnnitLG0uq+NEpOuomw5U8gIfI4WqRloFq3dxDp3xfuji6sdwv6vBdlnyb7a3N0dERwcDA2btyYK6KU261b55//IiUl5Y43J8GXYP8yEd2vjSdvosecf1TQ5GBng8k96+LnUa1KZdBExheRHIENu2ch4+YxdMkEAh79AWgxilVv5UzaVSddaMOHD0fz5s3VYG9JRyAtSIZZdsOGDYOfn58apyT69OmjZuI1bdo0u6tOWqHkfkMARURUVMnpWXh71Un8vOeKul27ojvmDGyKoMrF351PpZ/8ED8ZcxLHo4+jUt0+aOlUGY4BbYEqzeUXvqmLR9YcOA0cOFAN0p46dSoiIiLQpEkTrF27NnvA+JUrV3K1ME2ZMkU11cm/169fV32dEjS98847JnwXRGTuS6b8b8lBXIpOUbefblcd47vXgbMDf4xZowxdBvYcW4xwO6B+pZaoV74ebPzambpYVIrYaFbWxyWDwz09PdXofGMNDpdZFpLAk+MOSg7r3TTMud71eg3fbr+I99eeUmkGKnk64+PHGqNNTS+YA3Ou+9IqNjUGYbtnI/PADwjxrAnfYX8BDrkHgLPeTcPY9V6U2MCsZtURERWHmOQMjP/1MDadisxemHfmw43g6cpkltbqUvgBHNg0BR7XD6BjahrcvD0Bvc7UxaJSiIETEVmV3Rei8fKSQ4hISIOjvS2m9g7CkJBqpSbNAJUsnV6HQ+dW4cL6yaieeAtNM3Sw6zgZaD8OsGMgTXdi4EREVkGn1zBv8znM2XBGLcwb6OWGzwY34wBwK5aSmYKwvZ8ifu9XCE6KQ6BrZWDYj0DlJqYuGpViDJyIyOJFJqThf78cws7z0ep2/2Z+eKtvA7g58SvQmlMN7LmxG3Ynf0fn+BiUq9QUeHwxUCb/PIJEBvzWICKLtvXMLYz75RCikzPg4mCHt/o1UAv0knWS+VCnYk7hWPQxVHStiJC+38Hp0M9Ap8mAvaOpi0dmgIETEVmkTJ0es9afwedbzqvbdX3LqK65mj7upi4amUimLhN79nyCG9GnEdTqZQRVCLo9ti10Go8JFRoDJyKyONdiU/DSzwdx4Eqcui2Dv9/oHcTcTFYsPikCO9eNR/qFTWiXmoZKDUcCXpwQQEXHwImILMq64xGYsOwI4lMzUcbJHu890gi9GlUydbHIhC6fXYsDW6fBPeYy2qemwb31i7ezgBPdAwZORGQR0rN0mLn6FL7beUndblzFE58OaoZqFbjOnLXSa3ocPvgNzm19BwHpqWhq4wb7QT8AtR8wddHIjDFwIiKzdzUmBS8sPoAj1+Kzl02Z2KOuytNEVpxqYM9cxO2ej+DUFARW6wD0nQd4VDZ10cjMMXAiIrO2+VSkSjUgXXOeLg6YNaAxuta7vd4lWafIlEjsCt8FW30GOiUloULdh4D+X3PWHBULBk5EZLYJLSWZ5aebzqnbjauWxbzBTVGlHLvmrNnpmNM4GnUUPq4+CGnzKpy8mgF1HgRsuWgzFQ8GTkRkdqKT0tWyKdvPRanbw1r74/Ve9eBkz5OjNaca2Hv0R1w/tgR1201GA7/2t1MN1Otj6qKRhWHgRERm5cCVWDz/0wG11pwktHzvkYbo28TP1MUiE4pPi0fYljeQdvw3tElNhV/Wx8DwTjw
mZBQMnIjIbDI+L95zBdP/OI5MnYZAbzd88UQwalcsY+qikQldvXUM+9aOhVv4UXRNTUOZhgOBnu8DXLSZjISBExGZRaqBqSuP45d9V9XtHvV98dGAxnDnWnNWnWrgyJEfcXbbTFRLikFwJmD/4EdA86cYNJFRMXAiolItMjENz/64X2UBt7UBJnSvi2c7Bt4ev0JWKTUrFbv2f4mY7R+iaVo6anoEAP2/BPyCTV00sgIMnIio1DpyLQ6jf9yP8Pg0eDjb49PBzdCxtrepi0UmdCvllko1gHLV0NEtAF6BDYFeHwOObjwuVCIYOBFRqfT7oeuYuOwI0rP0qOHthm+Gt0B1L54crdnp63twNOkyvCXVQKUQOPt3A5w4xo1KFgMnIip1+Zk+/Ps0vth6Xt3uUtcHcx5vAg9nB1MXjUwkU5+JfQcX4Nr2D1GnVi806DELtpKXyd6Zx4RKHAMnIio1EtIy8b8lh7DpVKS6/WzHGpjQvQ7sZHATWaWE9Hjs3PY2Ug//jNapKahy/SiQmcKWJjIZBk5EVCpcjErG09/vxflbyXCyt8UHjzZifiYrdzX2AvZtfBWuF/5B17Q0eDQYAPT5BHBgSxOZDgMnIjK5f87cwpjFB5CQlgVfD2d8NSwYjaqUNXWxyISpBo5e3Y4zG99A1ZsnEZyWAYfQ6UDbl5lqgEyOgRMRmTSp5bfbL+Ld1Seh14Cm1criyyeC4ePBFgVrTjWw+/pORK+dgMZRV1DbxhkY9ANQp4epi0akMHAiIpMltXx9xTEs239N3X40uAreebgB15uzYlGpUQi7Eaaudwx+AV47PgUG/Qz41DN10YiMEzjJr0cmpSOi/xKZkIbRP+3HwX+TWr7eKwhPtg3g94cVOxtxCEeubUN5nwZoVbkVXGq4AI0HAw4upi4aUS62KKIRI0YgOTn5jvsvXbqEDh06FHV3RGRlDl+Nw0Of7VBBk6eLA75/siWealedQZOVytJnYfelDTj0+1OosWM+OpavDxf7f4MlBk1kCYHT4cOH0ahRI4SF3W5OFd9//z0aN24MLy+v4i4fEVlYUssBX4YhIiENNX3c8fsLbdG+FjOBW6vEjERs3D8fN9ZOQKtbF9EkS4Nt8i1TF4uoeLvq9uzZg9deew2dOnXCK6+8gnPnzmHNmjWYNWsWRo0aVdTdEZEV0EtSy3Wn8fmW20ktu/6b1LIMk1paresJV7Fny1S4nF6LLqmp8HT1AYauBHzqmrpoRMUbODk4OODDDz+Eq6sr3nrrLdjb22Pr1q1o3bp1UXdFRFYgLVOHcUsPYfXRCHX7+U418MoDTGppzakGjl/ZhlNbpsEv/CRapKXBoelQ4IG3AJdypi4eUfF31WVmZqqWpvfffx+TJ09WAVP//v2xevXqou6KiCxcVFI6Hv9qlwqaHOxsMGtAY0zsUZeZwK1UWlYatl3bhtP75qPRtWNoo7OFQ/+vgb6fMWgiy21xat68OVJSUrBlyxa0atVKzaT74IMPVPD05JNPYv78+cYpKRGZlbM3EzHyu724FpuKsq4OKj9TSGAFUxeLTCQ6NRph4WHqnNGh6wfw0WYAnV4DvGrymJBltzhJ4HTo0CEVNAlJPzBp0iQ1WPyff/4xRhmJyMzsOBeF/p/vVEFTQAVXLH+uDYMmK3b+9B/Ysuk1uNo5o2u1rvAp6w88uoBBE1lHi9O3336b7/1NmzbF/v37i6NMRGTGlu69itdWHEWWXkNz/3L4alhzlHdzNHWxyASystJxYMNkXD6+FLUy0tEoVQfbR0N5LMh6E2CmpaUhIyMj131OTk73WyYiMtOZcx+tO435/86ce6hxZbVQr7ODnamLRiaQlByJncuHIvnmUYSkpaNa7d5Aj3d5LMj6AidJfildc0uXLkV0dPQdj+t0uuIqGxGZifRMHcb/dhSrjoSr2y91qYmx3WozqaWVunH2b+zZOAlOiTfRJUMPz15zgaZPmLpYRKYZ4zRx4kRs2rQJn3/+uWp
d+uabbzBjxgxUrlwZP/zwQ/GUiojMRnxaFoYu2KuCJpk599FjjTHugToMmqyQDPw+tvdz7PjzafjEhyPU1hOeg5YyaCLrbnH6888/VYAkCTBHjhyJ9u3bo2bNmvD398eiRYswZMgQ45SUiEqda7EpeOaX07gcm4Yyzvb4cmgw2tTgCgLWKF2Xjt3huxHp6ICGejvU8e8Gm76fMs0AWZwiB04xMTEIDAxU1z08PNRt0a5dOzz33HPFX0IiKpWOXY9X6QZuJaajkqczvhvZEnV8y5i6WGQCMZHHEJZ0GTpNh/Y1+6Bilc5A+UCZds3jQRanyF11EjRdvHhRXa9bt64a62RoiSpbtmzxl5CISp1/ztzCwC/DVNBUw8sFy55txaDJGun1uLB2Ajb/1BPOEccQWi0UFd0qAhVqMGgii1XkFifpnpOFfjt27IhXX30Vffr0wWeffaYyist6dURk2X7ddxWTl99ON9A6sDze6l4NlTz/Xc2erEZWehIOrhiBS9d3okZGFhrfugo7B1dTF4uo9AVOY8eOzb4eGhqKU6dOqfxNMs6pUaNGxV0+IipFA38/23QOH68/o273bVIZ7/VvgPiYO2fXkmVLTriOsN+GICH6DFpk6BDQ61Og8UBTF4uo9OdxEjIoXC5EZLmydHq88ftx/Lznirr9bMcamNi9joRTpi4albCIG/uw+49n4JgYji46B5R9/GcgsCOPA1mNewqc9u7di82bNyMyMhJ6vT7XY+yuI7IsKRlZGLP4IDadilRjfWc8VB/DWgdkJ70k62lxPHHzAE6sGI5KSVFoaV8ejsOWARXrm7poRKU7cHr33XcxZcoU1KlTBxUrVsyVqyXndSIyf1FJ6Xjqu704fC0eTva2mPt4U/Ro4GvqYlEJy9BlqFQDESkRqB/QGfUu7obNsN8Bj8o8FmR1ihw4zZ07FwsWLMCIESOMUyIiKhWuRKdg2ILduBSdgrKuDvh2eHME+5c3dbGohMVe2oaw9Ahk2jmivV97+FbvA2h6wJEDwck6FTlwsrW1Rdu2bY1TGiIqFU6GJ2DYgj0q3UCVci74/smWqOHtbupiUUnSNFzc8REO7v0MHm4V0fGxX+HmxtZGItt7mVU3b9481hyRhdpzMQYD/s3RVNe3DH57rg2DJiujS43DviUPY9/u2aiWnorOLlXh5lLO1MUiMs8Wp/Hjx6NXr16oUaMGgoKC4ODgkOvx5cuXF2f5iKgErT9xE2MWH0B6lh4tAsrhm+Et4OmS+2+cLFtK7EWELRuE+PgrCM7QIbDTVKDVc4CtnamLRmSegdNLL72kZtR17twZFSpU4IBwIgux9N/Eljq9htB6PvhscDM4O/BkaU0izq7F7rVj4ZAWh86aM8oN+xWo0tzUxSIy78Dp+++/x2+//aZanYjIMnyx9TzeW3NKXX80uAre698Q9nZF7sknM041cCr6JI5tfwe+KTFo6R4Ap0E/315vjojuL3AqX7686qYjIvMneZjeW3sKX/1zQd0e3SEQr/asy5ZkK5Kpy8SeiD24kXwDQV3eQtDh5bDp/i7gxMkARMUSOE2fPh3Tpk3DwoUL4erK6ahE5ipTp8ervx3FbweuqduvPVgXz3TgjyJrEnd5B8LOrEB69XZoV7kdKrlXAvw7mbpYRJYVOH3yySc4f/68Sn4ZEBBwx+DwAwcOFGf5iMgI0jJ1ahD4hpORsLO1wfuPNFJddGQ9Lu2chYO75sA9KwPtKwbDXYImIir+wKlfv35FfQoRlSJJ6VkY9f0+hF2IVtnA5w9phq71Kpq6WFRCdCmxOLzqBZy/sgUBmVloWq0L7Gt2Z/0TGStwkm46IjJPcSkZGL5wLw5fjYO7kz2+Gd4crQIrmLpYVEJSos8jbNnjiEu8juD0DAS2nQB0nCjrZfEYEBlzkV8iMj+RCWkY+u0enL6ZiHKuDiobeKMqZU1dLCohkRc2Yteal2GXEo3OtmVQfuiPQLUQ1j9RETFwIrICV2NS8MS
", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Random-baseline cost curve\n", + "rng = np.random.default_rng(RANDOM_STATE)\n", + "perm = rng.permutation(len(tau_r_test))\n", + "\n", + "tau_r_rand = np.clip(tau_r_test[perm], 0.0, None)\n", + "tau_c_rand = np.clip(tau_c_test[perm], 0.0, None)\n", + "\n", + "cum_cost_rand = np.cumsum(tau_c_rand)\n", + "cum_gain_rand = np.cumsum(tau_r_rand)\n", + "\n", + "cum_cost_rand = np.insert(cum_cost_rand, 0, 0.0)\n", + "cum_gain_rand = np.insert(cum_gain_rand, 0, 0.0)\n", + "\n", + "x_rand = cum_cost_rand / cum_cost_rand[-1]\n", + "y_rand = cum_gain_rand / cum_gain_rand[-1]\n", + "\n", + "aucc_rand = np.trapz(y_rand, x_rand)\n", + "print(\"Random ranking AUCC:\", aucc_rand)\n", + "\n", + "# Plot\n", + "plt.figure(figsize=(6, 5))\n", + "plt.plot(x, y, label=f\"Duality R-learner (AUCC={aucc:.3f})\")\n", + "plt.plot(x_rand, y_rand, linestyle=\"--\", label=f\"Random (AUCC={aucc_rand:.3f})\")\n", + "plt.plot([0, 1], [0, 1], alpha=0.4, linewidth=1, label=\"y=x reference\")\n", + "\n", + "plt.xlabel(\"Cumulative cost / max\")\n", + "plt.ylabel(\"Cumulative gain / max\")\n", + "plt.title(\"Cost curve on Test set (τ-based)\")\n", + "plt.legend()\n", + "plt.grid(alpha=0.3)\n", + "plt.tight_layout()\n", + "plt.show()\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "965eecec", + "metadata": {}, + "outputs": [], + "source": [ + " " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 8ee2699f7e179d5f0f83e42557e26ddbaabb0ce4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=EC=A1=B0=ED=95=B4=EC=B0=BD?= Date: Sun, 14 Dec
2025 17:49:30 +0900 Subject: [PATCH 3/5] fix: dataset and aucc logic --- ...rning_for_effectiveness_optimization.ipynb | 1225 +++++++++-------- book/prescriptive_analytics/overview.md | 4 +- 2 files changed, 652 insertions(+), 577 deletions(-) diff --git a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb index c5b0644..0082571 100644 --- a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb +++ b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb @@ -47,24 +47,69 @@ } ], "source": [ - "%pip -q install scikit-uplift" + "%pip -q install fractional-uplift" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 1, "id": "9114f7da", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "

\n" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", - "from sklearn.model_selection import train_test_split\n", "from sklearn.linear_model import Ridge, LogisticRegression\n", "from sklearn.metrics import r2_score, roc_auc_score\n", "\n", - "from sklift.datasets import fetch_hillstrom\n", + "import fractional_uplift as fr \n", "\n", "import matplotlib.pyplot as plt\n", "\n", @@ -80,27 +125,33 @@ "id": "57cbb979", "metadata": {}, "source": [ - "### Hillstrom E-mail Test Dataset\n", + "### CriteoWithSyntheticCostAndSpend Dataset\n", "\n", - "We use the Kevin Hillstrom E-mail Test Dataset. \n", - "This data is the log of an e-mail marketing A/B/n test.\n", "\n", - "- **Treatment**: ${T}$\n", - " - `Mens E-Mail`, `Womens E-Mail` $\\Rightarrow$ ${T = 1}$ (e-mail sent)\n", - " - `No E-Mail` $\\Rightarrow$ ${T = 0}$ (control group)\n", + "- The CriteoWithSyntheticCostAndSpend data includes all of\n", + " - treatment: whether the ad was shown (0/1)\n", + " - spend: revenue (profit) generated by the customer\n", + " - cost: the per-customer cost incurred when that customer is treated\n", "\n", - "- **Gain outcome**: ${Y^r}$\n", - " - Amount spent over two weeks, `spend`\n", - " - The question of interest: “How much does spend increase when an e-mail is sent?”\n", + " so it is well suited to treatment-optimization experiments that account for cost. 
\n", "\n", - "- **Cost outcome**: ${Y^c}$\n", - " - Simplify the cost of each e-mail sent to 1 unit\n", - " - Hence $Y^c = T \\in \\{0,1\\}$\n" + "- Consists of three DataFrames\n", + " - `train_data`\n", + " - `distill_data` (used here as the validation set)\n", + " - `test_data`\n", + "\n", + "- Key columns\n", + " - `treatment`: whether the ad/promotion was shown (0/1)\n", + " - `spend`: revenue (profit) generated by the user → Gain outcome: $(Y^r)$\n", + " - `cost`: cost incurred when that customer is treated → Cost outcome: $(Y^c)$\n", + " - `treatment_propensity`: probability of being assigned to treatment in the experiment\n", + " - `sample_weight`: sample weight\n", + " - `criteo.features`: list of feature column names (list of strings)" ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 2, "id": "b2b3d7a2", "metadata": {}, "outputs": [ @@ -108,25 +159,13 @@ "name": "stdout", "output_type": "stream", "text": [ - "data shape: (64000, 8)\n", + "Train shape: (72053, 19)\n", + "Val shape: (17774, 19)\n", + "Test shape: (20333, 19)\n", "\n", - "spend (target) describe:\n", - "count 64000.000000\n", - "mean 1.050908\n", - "std 15.036448\n", - "min 0.000000\n", - "25% 0.000000\n", - "50% 0.000000\n", - "75% 0.000000\n", - "max 499.000000\n", - "Name: spend, dtype: float64\n", + "Feature columns: ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'f10', 'f11']\n", "\n", - "segment (treatment_raw) distribution:\n", - "segment\n", - "Womens E-Mail 0.334172\n", - "Mens E-Mail 0.332922\n", - "No E-Mail 0.332906\n", - "Name: proportion, dtype: float64\n" + "Train head:\n" ] }, { @@ -150,150 +189,234 @@ " \n", " \n", " \n", - " recency\n", - " history_segment\n", - " history\n", - " mens\n", - " womens\n", - " zip_code\n", - " newbie\n", - " channel\n", + " f0\n", + " f1\n", + " f2\n", + " f3\n", + " f4\n", + " f5\n", + " f6\n", + " f7\n", + " f8\n", + " f9\n", + " f10\n", + " f11\n", + " treatment\n", + " conversion\n", + " treatment_propensity\n", + " cost_percentage\n", + " spend\n", + " cost\n", + " sample_weight\n", " \n", " \n", " \n", " \n", - " 0\n", - " 10\n", - " 2) $100 - $200\n", - " 142.44\n", + " 
44\n", + " 12.616365\n", + " 10.059654\n", + " 8.964588\n", + " 4.679882\n", + " 10.280525\n", + " 4.115453\n", + " 0.294443\n", + " 4.833815\n", + " 3.955396\n", + " 13.190056\n", + " 5.300375\n", + " -0.168679\n", " 1\n", " 0\n", - " Surburban\n", - " 0\n", - " Phone\n", + " 0.85\n", + " 0.000000\n", + " 0.000000\n", + " 0.000000\n", + " 100.0\n", " \n", " \n", - " 1\n", - " 6\n", - " 3) $200 - $350\n", - " 329.08\n", - " 1\n", + " 187\n", + " 12.616365\n", + " 10.059654\n", + " 8.904597\n", + " 4.679882\n", + " 10.280525\n", + " 4.115453\n", + " 0.294443\n", + " 4.833815\n", + " 3.955396\n", + " 13.190056\n", + " 5.300375\n", + " -0.168679\n", " 1\n", - " Rural\n", - " 1\n", - " Web\n", + " 0\n", + " 0.85\n", + " 0.000000\n", + " 0.000000\n", + " 0.000000\n", + " 100.0\n", " \n", " \n", - " 2\n", - " 7\n", - " 2) $100 - $200\n", - " 180.65\n", - " 0\n", + " 484\n", + " 22.377238\n", + " 10.059654\n", + " 8.214383\n", + " 4.679882\n", + " 10.280525\n", + " 4.115453\n", + " -2.411115\n", + " 4.833815\n", + " 3.971858\n", + " 13.190056\n", + " 5.300375\n", + " -0.168679\n", " 1\n", - " Surburban\n", - " 1\n", - " Web\n", + " 0\n", + " 0.85\n", + " 0.000000\n", + " 0.000000\n", + " 0.000000\n", + " 100.0\n", " \n", " \n", - " 3\n", - " 9\n", - " 5) $500 - $750\n", - " 675.83\n", + " 528\n", + " 12.616365\n", + " 10.059654\n", + " 8.350682\n", + " 4.679882\n", + " 10.280525\n", + " 4.115453\n", + " 0.294443\n", + " 4.833815\n", + " 3.955396\n", + " 16.226044\n", + " 5.300375\n", + " -0.168679\n", " 1\n", " 0\n", - " Rural\n", - " 1\n", - " Web\n", + " 0.85\n", + " 0.000000\n", + " 0.000000\n", + " 0.000000\n", + " 100.0\n", " \n", " \n", - " 4\n", - " 2\n", - " 1) $0 - $100\n", - " 45.34\n", + " 1108\n", + " 14.617627\n", + " 10.059654\n", + " 8.489929\n", + " 3.907662\n", + " 13.253813\n", + " 4.115453\n", + " -2.411115\n", + " 4.833815\n", + " 3.809530\n", + " 42.176324\n", + " 5.737292\n", + " -0.560340\n", " 1\n", - " 0\n", - " Urban\n", - " 0\n", - " Web\n", + " 
1\n", + " 0.85\n", + " 0.090777\n", + " 36.459294\n", + " 3.309655\n", + " 1.0\n", " \n", " \n", "\n", "" ], "text/plain": [ - " recency history_segment history mens womens zip_code newbie channel\n", - "0 10 2) $100 - $200 142.44 1 0 Surburban 0 Phone\n", - "1 6 3) $200 - $350 329.08 1 1 Rural 1 Web\n", - "2 7 2) $100 - $200 180.65 0 1 Surburban 1 Web\n", - "3 9 5) $500 - $750 675.83 1 0 Rural 1 Web\n", - "4 2 1) $0 - $100 45.34 1 0 Urban 0 Web" + " f0 f1 f2 f3 f4 f5 f6 \\\n", + "44 12.616365 10.059654 8.964588 4.679882 10.280525 4.115453 0.294443 \n", + "187 12.616365 10.059654 8.904597 4.679882 10.280525 4.115453 0.294443 \n", + "484 22.377238 10.059654 8.214383 4.679882 10.280525 4.115453 -2.411115 \n", + "528 12.616365 10.059654 8.350682 4.679882 10.280525 4.115453 0.294443 \n", + "1108 14.617627 10.059654 8.489929 3.907662 13.253813 4.115453 -2.411115 \n", + "\n", + " f7 f8 f9 f10 f11 treatment \\\n", + "44 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", + "187 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", + "484 4.833815 3.971858 13.190056 5.300375 -0.168679 1 \n", + "528 4.833815 3.955396 16.226044 5.300375 -0.168679 1 \n", + "1108 4.833815 3.809530 42.176324 5.737292 -0.560340 1 \n", + "\n", + " conversion treatment_propensity cost_percentage spend cost \\\n", + "44 0 0.85 0.000000 0.000000 0.000000 \n", + "187 0 0.85 0.000000 0.000000 0.000000 \n", + "484 0 0.85 0.000000 0.000000 0.000000 \n", + "528 0 0.85 0.000000 0.000000 0.000000 \n", + "1108 1 0.85 0.090777 36.459294 3.309655 \n", + "\n", + " sample_weight \n", + "44 100.0 \n", + "187 100.0 \n", + "484 100.0 \n", + "528 100.0 \n", + "1108 1.0 " ] }, - "execution_count": 11, + "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "dataset = fetch_hillstrom(target_col=\"spend\", return_X_y_t=False)\n", + "criteo = fr.example_data.CriteoWithSyntheticCostAndSpend.load()\n", "\n", - "data = dataset.data.copy() # X (features, 아직 전처리 전)\n", - "y_gain = 
dataset.target.copy() # Y^r = spend\n", - "treatment_raw = dataset.treatment.copy() # 'Mens E-Mail', 'Womens E-Mail', 'No E-Mail'\n", + "train_df = criteo.train_data.copy()\n", + "val_df = criteo.distill_data.copy() # distill_data를 validation 데이터로 사용\n", + "test_df = criteo.test_data.copy()\n", + "features = criteo.features # feature column 리스트\n", "\n", - "print(\"data shape:\", data.shape)\n", - "print(\"\\nspend (target) describe:\")\n", - "print(y_gain.describe())\n", + "print(\"Train shape:\", train_df.shape)\n", + "print(\"Val shape:\", val_df.shape)\n", + "print(\"Test shape:\", test_df.shape)\n", + "print(\"\\nFeature columns:\", features)\n", "\n", - "print(\"\\nsegment (treatment_raw) 분포:\")\n", - "print(treatment_raw.value_counts(normalize=True))\n", - "\n", - "data.head()" + "print(\"\\nTrain head:\")\n", + "train_df.head()\n" ] }, { "cell_type": "code", - "execution_count": 12, - "id": "2ff956f2", + "execution_count": 3, + "id": "5c9e7a68", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Treatment 비율 (T=1): 0.66709375\n", - "\n", - "Y_gain (spend) 요약:\n", - "count 64000.000000\n", - "mean 1.050908\n", - "std 15.036448\n", + "treatment 비율: 0.8611161228540103\n", + "spend describe:\n", + "count 72053.000000\n", + "mean 7.638117\n", + "std 15.380174\n", "min 0.000000\n", "25% 0.000000\n", "50% 0.000000\n", "75% 0.000000\n", - "max 499.000000\n", + "max 172.747528\n", "Name: spend, dtype: float64\n", - "\n", - "Y_cost 분포:\n", - "segment\n", - "1.0 0.667094\n", - "0.0 0.332906\n", - "Name: proportion, dtype: float64\n" + "cost describe:\n", + "count 72053.000000\n", + "mean 2.558094\n", + "std 7.596816\n", + "min 0.000000\n", + "25% 0.000000\n", + "50% 0.000000\n", + "75% 0.000000\n", + "max 63.162000\n", + "Name: cost, dtype: float64\n" ] } ], "source": [ - "T = (treatment_raw != \"No E-Mail\").astype(int) # 이메일 받았으면 1, 아니면 0\n", - "\n", - "Y_gain = y_gain.astype(float) # spend (float)\n", - "Y_cost = T.astype(float) 
# 이메일 발송 비용 (0/1)\n", - "\n", - "print(\"Treatment 비율 (T=1):\", T.mean())\n", - "print(\"\\nY_gain (spend) 요약:\")\n", - "print(pd.Series(Y_gain).describe())\n", - "\n", - "print(\"\\nY_cost 분포:\")\n", - "print(pd.Series(Y_cost).value_counts(normalize=True).rename(\"proportion\"))\n" + "print(\"treatment 비율:\", train_df[\"treatment\"].mean())\n", + "print(\"spend describe:\")\n", + "print(train_df[\"spend\"].describe())\n", + "print(\"cost describe:\")\n", + "print(train_df[\"cost\"].describe())" ] }, { @@ -301,22 +424,25 @@ "id": "bfdbc4fd", "metadata": {}, "source": [ - "### Feature 전처리\n", - "\n", - "Hillstrom의 주요 feature 예시:\n", + "### Feature 행렬 & 타겟 정의\n", "\n", - "- `recency`, `history`, `mens`, `womens`, `newbie` 등: 숫자/0-1 변수\n", - "- `history_segment`, `zip_code`, `channel`: 범주형\n", + "- $X$: `features` 컬럼들\n", + "- $T$: `treatment` (0/1)\n", + "- $Y^r$: `spend` (gain)\n", + "- $Y^c$: `cost` (cost)\n", "\n", - "R-learner / Propensity 모델에 넣기 위해\n", + "여기서는:\n", "\n", - "- 숫자형 컬럼은 그대로 사용하고,\n", - "- 범주형 컬럼(`history_segment`, `zip_code`, `channel`)은 one-hot 인코딩으로 변환합니다.\n" + "- 데이터셋이 이미 `train / distill / test`로 나뉘어 있으므로,\n", + " - `train_df` → train\n", + " - `val_df` → validation\n", + " - `test_df` → test\n", + " 로 그대로 사용합니다.\n" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 4, "id": "e286adf5", "metadata": {}, "outputs": [ @@ -324,108 +450,32 @@ "name": "stdout", "output_type": "stream", "text": [ - "원본 feature columns:\n", - "['recency', 'history_segment', 'history', 'mens', 'womens', 'zip_code', 'newbie', 'channel']\n", - "\n", - "Numeric columns:\n", - "['recency', 'history', 'mens', 'womens', 'newbie']\n", - "\n", - "Categorical columns (one-hot 대상):\n", - "['history_segment', 'zip_code', 'channel']\n", - "\n", - "전처리 후 feature shape: (64000, 15)\n", - "전처리된 feature columns:\n", - "Index(['recency', 'history', 'mens', 'womens', 'newbie',\n", - " 'history_segment_2) $100 - $200', 'history_segment_3) $200 - $350',\n", - " 
'history_segment_4) $350 - $500', 'history_segment_5) $500 - $750',\n", - " 'history_segment_6) $750 - $1,000', 'history_segment_7) $1,000 +',\n", - " 'zip_code_Surburban', 'zip_code_Urban', 'channel_Phone', 'channel_Web'],\n", - " dtype='object')\n" + "X_train shape: (72053, 12)\n", + "X_val shape: (17774, 12)\n", + "X_test shape: (20333, 12)\n" ] } ], "source": [ - "print(\"원본 feature columns:\")\n", - "print(data.columns.tolist())\n", - "\n", - "# one-hot 대상 범주형 컬럼\n", - "categorical_cols = [\"history_segment\", \"zip_code\", \"channel\"]\n", - "\n", - "# 나머지는 숫자/0-1 컬럼으로 그대로 사용\n", - "numeric_cols = [c for c in data.columns if c not in categorical_cols]\n", + "X_train = train_df[features].values.astype(np.float32)\n", + "X_val = val_df[features].values.astype(np.float32)\n", + "X_test = test_df[features].values.astype(np.float32)\n", "\n", - "print(\"\\nNumeric columns:\")\n", - "print(numeric_cols)\n", - "print(\"\\nCategorical columns (one-hot 대상):\")\n", - "print(categorical_cols)\n", + "T_train = train_df[\"treatment\"].values.astype(int)\n", + "T_val = val_df[\"treatment\"].values.astype(int)\n", + "T_test = test_df[\"treatment\"].values.astype(int)\n", "\n", - "# one-hot 인코딩\n", - "X_cat = pd.get_dummies(data[categorical_cols], drop_first=True)\n", - "X_num = data[numeric_cols].reset_index(drop=True)\n", + "Yg_train = train_df[\"spend\"].values.astype(float) # gain\n", + "Yg_val = val_df[\"spend\"].values.astype(float)\n", + "Yg_test = test_df[\"spend\"].values.astype(float)\n", "\n", - "X_df = pd.concat([X_num, X_cat], axis=1)\n", + "Yc_train = train_df[\"cost\"].values.astype(float) # cost\n", + "Yc_val = val_df[\"cost\"].values.astype(float)\n", + "Yc_test = test_df[\"cost\"].values.astype(float)\n", "\n", - "print(\"\\n전처리 후 feature shape:\", X_df.shape)\n", - "print(\"전처리된 feature columns:\")\n", - "print(X_df.columns)\n", - "\n", - "# numpy array로 변환\n", - "X = X_df.values.astype(np.float32)\n" - ] - }, - { - "cell_type": "markdown", - "id": 
"eeb08865", - "metadata": {}, - "source": [ - "데이터 세트는 각각 60%, 20%, 20%의 비율로 학습, 검증 및 테스트 세트의 3부분으로 나뉩니다." - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "id": "3d964183", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Train shape: (38400, 15)\n", - "Val shape: (12800, 15)\n", - "Test shape: (12800, 15)\n", - "\n", - "Treatment 비율 (Train/Val/Test):\n", - "Train: 0.6670833333333334\n", - "Val : 0.667109375\n", - "Test : 0.667109375\n" - ] - } - ], - "source": [ - "# Train / Validation / Test 분할\n", - "X_train_val, X_test, T_train_val, T_test, Yg_train_val, Yg_test, Yc_train_val, Yc_test = train_test_split(\n", - " X, T, Y_gain, Y_cost,\n", - " test_size=0.2,\n", - " random_state=RANDOM_STATE,\n", - " stratify=T,\n", - ")\n", - "\n", - "X_train, X_val, T_train, T_val, Yg_train, Yg_val, Yc_train, Yc_val = train_test_split(\n", - " X_train_val, T_train_val, Yg_train_val, Yc_train_val,\n", - " test_size=0.25, # 0.25 * 0.8 = 0.2\n", - " random_state=RANDOM_STATE,\n", - " stratify=T_train_val,\n", - ")\n", - "\n", - "print(\"Train shape:\", X_train.shape)\n", - "print(\"Val shape:\", X_val.shape)\n", - "print(\"Test shape:\", X_test.shape)\n", - "\n", - "print(\"\\nTreatment 비율 (Train/Val/Test):\")\n", - "print(\"Train:\", T_train.mean())\n", - "print(\"Val :\", T_val.mean())\n", - "print(\"Test :\", T_test.mean())\n" + "print(\"X_train shape:\", X_train.shape)\n", + "print(\"X_val shape:\", X_val.shape)\n", + "print(\"X_test shape:\", X_test.shape)" ] }, { @@ -438,8 +488,8 @@ "Duality R-learner는 다음 두 단계를 결합한 방식입니다.\n", "\n", "1. R-learner로 Gain/Cost CATE 추정\n", - " - $\\tau_r(x)$: gain uplift (예: spend uplift)\n", - " - $\\tau_c(x)$: cost uplift (예: 이메일 발송 비용 증가량)\n", + " - $\\tau_r(x)$: gain uplift\n", + " - $\\tau_c(x)$: cost uplift\n", "\n", "2. 
예산 제약(budget constraint)을 듀얼 형태로 최적화\n", " - 라그랑지 승수 $\\lambda$ 를 학습하여 최적 정책을 찾습니다.\n", @@ -453,8 +503,8 @@ " \\quad \\text{s.t.} \\quad\n", " \\sum_i \\tau_c(x^{(i)}) z_i \\le B\n", " $$\n", - " \n", - "- $z_i = 1$ 이면 고객 $i$ 에게 이메일 발송, $z_i = 0$ 이면 미발송\n", + "\n", + "- $z_i = 1$ 이면 고객 $i$ 에게 프로모션/광고를 집행, $z_i = 0$ 이면 미집행\n", "\n", "Duality R-learner 핵심 단계:\n", "\n", @@ -462,7 +512,7 @@ "2. Gain / Cost R-learner: $\\tau_r(x)$, $\\tau_c(x)$ 추정 \n", "3. Duality: $\\lambda$ 를 gradient ascent 로 최적화 \n", "4. 정책 생성: $s(x) = \\tau_r(x) - \\lambda^* \\tau_c(x)$\n", - "5. Cost Curve / AUCC 로 정책 성능 평가 " + "5. Cost Curve / AUCC 로 정책 성능 평가 \n" ] }, { @@ -480,21 +530,26 @@ "$$\n", "\n", "여기서 \n", - "- $m^*(X) = \\mathbb{E}[Y \\mid X]$: outcome 평균 \n", + "\n", + "- $m^*(X) = \\mathbb{E}[Y \\mid X]$: outcome 평균 모델 \n", "- $e^*(X) = \\mathbb{P}(T=1 \\mid X)$: propensity score \n", "\n", - "Gain outcome에 대한 nuisance 모델은 다음과 같이 구성합니다.\n", + "Criteo 셋에서는 gain과 cost가 모두 연속값이므로 \n", + "각각 독립적인 회귀모델을 쓰는 것이 자연스럽습니다.\n", "\n", - "- $m_r(x)$: Ridge 회귀 \n", - "- $e(x)$: Logistic 회귀 \n", + "- Gain outcome $Y^r = \\texttt{spend}$\n", + " - $m_r(x) = \\mathbb{E}[Y^r\\mid X=x]$: Ridge 회귀\n", "\n", - "Cost outcome은 $Y^c = T$ 이므로\n", - "$m_c(x) = e(x)$" + "- Cost outcome $Y^c = \\texttt{cost}$\n", + " - $m_c(x) = \\mathbb{E}[Y^c\\mid X=x]$: Ridge 회귀\n", + "\n", + "- Treatment model\n", + " - $e(x) = \\mathbb{P}(T=1\\mid X=x)$: Logistic 회귀" ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 7, "id": "3e718594", "metadata": {}, "outputs": [ @@ -503,18 +558,18 @@ "output_type": "stream", "text": [ "== m_r(x) 성능 (R^2: spend 회귀) ==\n", - "Train R^2: 0.0011321804124860835\n", - "Val R^2: 0.0006200906912602333\n", + "Train R^2: 0.5991714831471346\n", + "Val R^2: 0.6034863154274288\n", "\n", "예측값 분포 (Val):\n", - "count 12800.000000\n", - "mean 0.998539\n", - "std 0.499257\n", - "min -4.757108\n", - "25% 0.690851\n", - "50% 0.961950\n", - "75% 1.234435\n", - "max 4.417754\n", + 
"count 17774.000000\n", + "mean 7.529438\n", + "std 11.912387\n", + "min -5.214262\n", + "25% -0.278099\n", + "50% 1.335855\n", + "75% 12.749807\n", + "max 81.783020\n", "dtype: float64\n" ] } @@ -527,20 +582,66 @@ "Yg_pred_train = m_r.predict(X_train)\n", "Yg_pred_val = m_r.predict(X_val)\n", "\n", - "r2_train = r2_score(Yg_train, Yg_pred_train)\n", - "r2_val = r2_score(Yg_val, Yg_pred_val)\n", + "r2_train_mr = r2_score(Yg_train, Yg_pred_train)\n", + "r2_val_mr = r2_score(Yg_val, Yg_pred_val)\n", "\n", "print(\"== m_r(x) 성능 (R^2: spend 회귀) ==\")\n", - "print(\"Train R^2:\", r2_train)\n", - "print(\"Val R^2:\", r2_val)\n", + "print(\"Train R^2:\", r2_train_mr)\n", + "print(\"Val R^2:\", r2_val_mr)\n", "\n", "print(\"\\n예측값 분포 (Val):\")\n", - "print(pd.Series(Yg_pred_val).describe())\n" + "print(pd.Series(Yg_pred_val).describe())" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 8, + "id": "083c0aa5", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "== m_c(x) 성능 (R^2: cost 회귀) ==\n", + "Train R^2: 0.2841092845226817\n", + "Val R^2: 0.29156262859724946\n", + "\n", + "예측값 분포 (Val):\n", + "count 17774.000000\n", + "mean 2.524266\n", + "std 4.070278\n", + "min -24.999222\n", + "25% -0.114549\n", + "50% 0.894167\n", + "75% 4.298206\n", + "max 23.168346\n", + "dtype: float64\n" + ] + } + ], + "source": [ + "# Cost outcome 평균 모델 m_c(x): Ridge 회귀 (cost 전용 모델)\n", + "m_c = Ridge(alpha=1.0, random_state=RANDOM_STATE)\n", + "m_c.fit(X_train, Yc_train)\n", + "\n", + "Yc_pred_train = m_c.predict(X_train)\n", + "Yc_pred_val = m_c.predict(X_val)\n", + "\n", + "r2_train_mc = r2_score(Yc_train, Yc_pred_train)\n", + "r2_val_mc = r2_score(Yc_val, Yc_pred_val)\n", + "\n", + "print(\"== m_c(x) 성능 (R^2: cost 회귀) ==\")\n", + "print(\"Train R^2:\", r2_train_mc)\n", + "print(\"Val R^2:\", r2_val_mc)\n", + "\n", + "print(\"\\n예측값 분포 (Val):\")\n", + "print(pd.Series(Yc_pred_val).describe())" + ] + }, + { + "cell_type": 
"code", + "execution_count": 9, "id": "1aa8d52b", "metadata": {}, "outputs": [ @@ -549,12 +650,12 @@ "output_type": "stream", "text": [ "== e(x) 성능 (AUC: treatment 모델) ==\n", - "Train AUC: 0.5117265124259399\n", - "Val AUC: 0.49799591470904553\n", + "Train AUC: 0.5432668234321524\n", + "Val AUC: 0.5470811168626702\n", "\n", "Propensity e(x) range:\n", - "Train: 0.633101626124968 → 0.8026040280233658\n", - "Val : 0.6331449838328645 → 0.8183494704982611\n" + "Train: 0.8259820462034405 → 0.9497856673846461\n", + "Val : 0.8258405409868536 → 0.9441271638409029\n" ] } ], @@ -583,18 +684,7 @@ "\n", "print(\"\\nPropensity e(x) range:\")\n", "print(\"Train:\", e_train.min(), \"→\", e_train.max())\n", - "print(\"Val :\", e_val.min(), \"→\", e_val.max())\n" - ] - }, - { - "cell_type": "markdown", - "id": "a2b17bd8", - "metadata": {}, - "source": [ - "여기서 얻은 ${e(x)}$ 는 이후에\n", - "\n", - "- Gain R-learner에서 ${T - e(x)}$ 항을 만들 때,\n", - "- Cost R-learner에서 ${m_c(x)}$ 로도 재사용합니다. " + "print(\"Val :\", e_val.min(), \"→\", e_val.max())" ] }, { @@ -602,45 +692,44 @@ "id": "06b5558e", "metadata": {}, "source": [ - "### 2. Gain R-learner: $\\tau_r(x)$\n", + "### 2. R-learner: Gain / Cost CATE 추정\n", "\n", "Gain outcome $Y^r$ 에 대해 R-learner 구조는 다음과 같습니다.\n", "\n", "$$\n", - "Y^r - m_r(X)\n", - "= (T - e(X))\\,\\tau_r(X) + \\epsilon\n", + "Y - m(X) = (T - e(X))\\,\\tau(X) + \\epsilon\n", "$$\n", "\n", - "선형 모델 $\\tau_r(x) = w_r^\\top x$ 를 사용하면 학습 절차는 다음과 같습니다.\n", - "\n", - "1. 잔차 계산 \n", + "선형 모델 $\\tau(x) = w^\\top x$ 를 쓰면:\n", "\n", + "1. **잔차 계산**\n", " $$\n", - " r^Y = Y^r - \\hat m_r(X), \\quad r^T = T - \\hat e(X)\n", + " r^Y = Y - \\hat m(X), \\quad r^T = T - \\hat e(X)\n", " $$\n", - "\n", - "2. 행별 스케일링 \n", - "\n", + "2. **행 단위 스케일링**\n", " $$\n", " Z = X \\odot r^T\n", " $$\n", - "\n", - "3. 회귀 \n", - "\n", + "3. **회귀**\n", + " $$\n", + " r^Y \\approx Z w\n", + " $$\n", + "4. 
**최종 CATE**\n", " $$\n", - " r^Y \\approx Z w_r\n", + " \\hat\\tau(x) = w^\\top x\n", " $$\n", "\n", - "4. 최종 CATE \n", + "이를 공통 함수로 구현하고,\n", "\n", - " $$\n", - " \\hat\\tau_r(x) = w_r^\\top x\n", - " $$" + "- Gain R-learner: $Y = Y^r$, $m = m_r$\n", + "- Cost R-learner: $Y = Y^c$, $m = m_c$\n", + "\n", + "로 각각 학습합니다." ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 10, "id": "28eb5e7c", "metadata": {}, "outputs": [], @@ -658,7 +747,7 @@ " 선형 τ(x) = w^T x 를 R-learner 방식으로 학습.\n", " - X_tr, X_val: feature 행렬\n", " - T_tr, T_val: treatment (0/1)\n", - " - Y_tr, Y_val: outcome\n", + " - Y_tr, Y_val: outcome (gain or cost)\n", " - m_tr, m_val: m(x) = E[Y|X] 예측값\n", " - e_tr, e_val: e(x) = P(T=1|X) 예측값\n", " \"\"\"\n", @@ -694,12 +783,12 @@ " print(\"\\nVal τ_hat summary:\")\n", " print(pd.Series(tau_val).describe())\n", "\n", - " return tau_model, tau_tr, tau_val\n" + " return tau_model, tau_tr, tau_val" ] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 11, "id": "03fc2e68", "metadata": {}, "outputs": [ @@ -709,31 +798,31 @@ "text": [ "== Gain R-learner τ_r(x) 요약 ==\n", "Train τ_hat summary:\n", - "count 38400.000000\n", - "mean 0.562616\n", - "std 0.454530\n", - "min -2.669276\n", - "25% 0.265676\n", - "50% 0.528041\n", - "75% 0.837171\n", - "max 1.976344\n", + "count 72053.000000\n", + "mean 0.869965\n", + "std 4.075175\n", + "min -16.366707\n", + "25% -0.214938\n", + "50% 0.119879\n", + "75% 1.240913\n", + "max 74.068790\n", "dtype: float64\n", "\n", "Val τ_hat summary:\n", - "count 12800.000000\n", - "mean 0.567048\n", - "std 0.453703\n", - "min -3.313805\n", - "25% 0.272715\n", - "50% 0.530470\n", - "75% 0.836859\n", - "max 1.944813\n", + "count 17774.000000\n", + "mean 0.847950\n", + "std 4.047337\n", + "min -13.205720\n", + "25% -0.215324\n", + "50% 0.116316\n", + "75% 1.209327\n", + "max 66.496148\n", "dtype: float64\n" ] } ], "source": [ - "# m_r(x) 예측값\n", + "# Gain R-learner: τ_r(x)\n", "m_r_train = 
m_r.predict(X_train)\n", "m_r_val = m_r.predict(X_val)\n", "\n", @@ -750,37 +839,13 @@ " e_val=e_val,\n", " alpha=1.0,\n", " name=\"Gain R-learner τ_r(x)\",\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "id": "e8ae0ccd", - "metadata": {}, - "source": [ - "### 3. Cost R-learner: $\\tau_c(x)$\n", - "\n", - "Cost outcome은 $Y^c = T$ 이므로 \n", - "nuisance model은 이미\n", - "\n", - "$$m_c(x) = e(x)$$\n", - "\n", - "입니다.\n", - "\n", - "Cost R-learner 식은\n", - "\n", - "$$\n", - "Y^c - m_c(X)\n", - "= (T - e(X))\\,\\tau_c(X)\n", - "$$\n", - "\n", - "Gain과 동일한 R-learner 구조로 $\\tau_c(x)$ 를 학습합니다." + ")" ] }, { "cell_type": "code", - "execution_count": 21, - "id": "f40867a2", + "execution_count": 12, + "id": "554cb0c4", "metadata": {}, "outputs": [ { @@ -789,33 +854,33 @@ "text": [ "== Cost R-learner τ_c(x) 요약 ==\n", "Train τ_hat summary:\n", - "count 38400.000000\n", - "mean 0.974710\n", - "std 0.156975\n", - "min 0.429515\n", - "25% 0.894193\n", - "50% 0.978889\n", - "75% 1.046973\n", - "max 2.098633\n", + "count 72053.000000\n", + "mean 2.873521\n", + "std 4.356113\n", + "min -18.526381\n", + "25% -0.005015\n", + "50% 0.903932\n", + "75% 4.806126\n", + "max 29.188446\n", "dtype: float64\n", "\n", "Val τ_hat summary:\n", - "count 12800.000000\n", - "mean 0.974875\n", - "std 0.157403\n", - "min 0.424263\n", - "25% 0.894096\n", - "50% 0.979144\n", - "75% 1.046372\n", - "max 2.265868\n", + "count 17774.000000\n", + "mean 2.838168\n", + "std 4.375784\n", + "min -28.228684\n", + "25% -0.024353\n", + "50% 0.862495\n", + "75% 4.697582\n", + "max 26.297773\n", "dtype: float64\n" ] } ], "source": [ - "# Cost outcome 평균 m_c(x)는 e(x)를 그대로 사용\n", - "m_c_train = e_train\n", - "m_c_val = e_val\n", + "# Cost R-learner: τ_c(x)\n", + "m_c_train = m_c.predict(X_train)\n", + "m_c_val = m_c.predict(X_val)\n", "\n", "tau_c_model, tau_c_train, tau_c_val = fit_r_learner_linear(\n", " X_tr=X_train,\n", @@ -830,13 +895,13 @@ " e_val=e_val,\n", " alpha=1.0,\n", " name=\"Cost R-learner 
τ_c(x)\",\n", - ")\n" + ")" ] }, { "cell_type": "code", - "execution_count": 22, - "id": "48f3ae13", + "execution_count": 13, + "id": "01798a5e", "metadata": {}, "outputs": [ { @@ -845,70 +910,70 @@ "text": [ "== τ_r(x) 요약 ==\n", "[Train]\n", - "count 38400.000000\n", - "mean 0.562616\n", - "std 0.454530\n", - "min -2.669276\n", - "25% 0.265676\n", - "50% 0.528041\n", - "75% 0.837171\n", - "max 1.976344\n", + "count 72053.000000\n", + "mean 0.869965\n", + "std 4.075175\n", + "min -16.366707\n", + "25% -0.214938\n", + "50% 0.119879\n", + "75% 1.240913\n", + "max 74.068790\n", "dtype: float64\n", "\n", "[Val]\n", - "count 12800.000000\n", - "mean 0.567048\n", - "std 0.453703\n", - "min -3.313805\n", - "25% 0.272715\n", - "50% 0.530470\n", - "75% 0.836859\n", - "max 1.944813\n", + "count 17774.000000\n", + "mean 0.847950\n", + "std 4.047337\n", + "min -13.205720\n", + "25% -0.215324\n", + "50% 0.116316\n", + "75% 1.209327\n", + "max 66.496148\n", "dtype: float64\n", "\n", "[Test]\n", - "count 12800.000000\n", - "mean 0.568442\n", - "std 0.450854\n", - "min -2.486865\n", - "25% 0.268170\n", - "50% 0.534959\n", - "75% 0.841292\n", - "max 1.970332\n", + "count 20333.000000\n", + "mean 2.168109\n", + "std 7.207469\n", + "min -13.727579\n", + "25% -1.233063\n", + "50% 0.928363\n", + "75% 3.340883\n", + "max 76.512638\n", "dtype: float64\n", "\n", "== τ_c(x) 요약 ==\n", "[Train]\n", - "count 38400.000000\n", - "mean 0.974710\n", - "std 0.156975\n", - "min 0.429515\n", - "25% 0.894193\n", - "50% 0.978889\n", - "75% 1.046973\n", - "max 2.098633\n", + "count 72053.000000\n", + "mean 2.873521\n", + "std 4.356113\n", + "min -18.526381\n", + "25% -0.005015\n", + "50% 0.903932\n", + "75% 4.806126\n", + "max 29.188446\n", "dtype: float64\n", "\n", "[Val]\n", - "count 12800.000000\n", - "mean 0.974875\n", - "std 0.157403\n", - "min 0.424263\n", - "25% 0.894096\n", - "50% 0.979144\n", - "75% 1.046372\n", - "max 2.265868\n", + "count 17774.000000\n", + "mean 2.838168\n", + "std 
4.375784\n", + "min -28.228684\n", + "25% -0.024353\n", + "50% 0.862495\n", + "75% 4.697582\n", + "max 26.297773\n", "dtype: float64\n", "\n", "[Test]\n", - "count 12800.000000\n", - "mean 0.975139\n", - "std 0.155813\n", - "min 0.436810\n", - "25% 0.895776\n", - "50% 0.978849\n", - "75% 1.045682\n", - "max 1.803177\n", + "count 20333.000000\n", + "mean 8.090442\n", + "std 4.941765\n", + "min -17.154167\n", + "25% 4.801457\n", + "50% 7.960261\n", + "75% 11.408073\n", + "max 26.761067\n", "dtype: float64\n" ] } @@ -940,63 +1005,57 @@ "id": "e6e387a2", "metadata": {}, "source": [ - "### 4. Duality: 예산 제약 하에서 라그랑지안 기반 $\\lambda$ 최적화\n", + "### 3. Duality: 예산 제약 하에서 $\\lambda$ 최적화\n", "\n", - "우리가 풀고자 하는 문제는 다음과 같습니다.\n", + "목표는 다음과 같습니다.\n", "\n", "$$\n", "\\begin{aligned}\n", - "\\max_{z_i \\in \\{0,1\\}} &\\quad \\sum_i \\tau_r(x^{(i)}) z_i \\\\\n", - "\\text{s.t.} &\\quad \\sum_i \\tau_c(x^{(i)}) z_i \\le B.\n", + "\\max_{z_i \\in \\{0,1\\}}\\quad & \\sum_i \\tau_r(x^{(i)}) z_i \\\\\n", + "\\text{s.t.}\\quad & \\sum_i \\tau_c(x^{(i)}) z_i \\le B\n", "\\end{aligned}\n", "$$\n", "\n", - "여기서 $z_i = 1$ 은 고객 $i$를 타겟팅하는 경우이며, $B$는 전체 예산입니다. \n", - "이 제약을 다루기 위해 라그랑지 승수 $\\lambda \\ge 0$ 를 도입하면 라그랑지안은\n", + "- $z_i = 1$: 고객 $i$ 타깃 (프로모션 발송)\n", + "- $B$: 사용할 수 있는 총 비용 예산\n", + "\n", + "이를 위해 라그랑지 승수 $\\lambda \\ge 0$ 를 도입합니다.\n", "\n", "$$\n", - "L(z,\\lambda)\n", + "L(z, \\lambda)\n", "= -\\sum_i \\tau_r(x^{(i)}) z_i\n", - "+ \\lambda\\left(\\sum_i \\tau_c(x^{(i)}) z_i - B\\right)\n", + " + \\lambda\\left(\\sum_i \\tau_c(x^{(i)}) z_i - B\\right)\n", "$$\n", "\n", - "으로 표현됩니다.\n", - "\n", - "고정된 $\\lambda$ 아래에서 고객 $i$의 효율성 점수는 다음과 같습니다.\n", + "고정된 $\\lambda$ 에 대해:\n", "\n", "$$\n", - "s_i(\\lambda) = \\tau_r(x^{(i)}) - \\lambda\\, \\tau_c(x^{(i)}).\n", + "s_i(\\lambda) = \\tau_r(x^{(i)}) - \\lambda\\, \\tau_c(x^{(i)})\n", "$$\n", "\n", - "점수가 양수이면 타겟팅하는 것이 유리하므로 \n", - "$s_i(\\lambda) \\ge 0$ 이면 $z_i = 1$, 음수이면 $z_i = 0$ 을 선택합니다. 
\n", - "즉, $\\lambda$가 주어지면 단순히 $s_i(\\lambda)$가 양수인 고객만 선택하면 됩니다.\n", + "- $s_i(\\lambda) \\ge 0$ 이면 $z_i = 1$ (타깃)\n", + "- $s_i(\\lambda) < 0$ 이면 $z_i = 0$ (비타깃)\n", "\n", - "듀얼 목적함수의 기울기는\n", + "듀얼 목적함수 기울기는\n", "\n", "$$\n", "\\frac{\\partial g}{\\partial \\lambda}\n", - "\\approx \\sum_i z_i \\tau_c(x^{(i)}) - B\n", + "\\approx \\underbrace{\\sum_i z_i\\,\\tau_c^+(x^{(i)})}_{\\text{cost\\_used}} - B\n", "$$\n", "\n", - "으로 근사할 수 있고, 이에 따른 gradient ascent 업데이트는\n", + "이며, gradient ascent 업데이트는\n", "\n", "$$\n", - "\\lambda \\leftarrow \\bigl[\\lambda + \\eta(\\text{cost\\_used} - B)\\bigr]_+\n", + "\\lambda \\leftarrow [\\lambda + \\eta(\\text{cost\\_used} - B)]_+\n", "$$\n", "\n", - "로 진행됩니다. 여기서 $[\\cdot]_+$ 는 $\\lambda$가 음수가 되지 않도록 하는 projection입니다.\n", - "\n", - "예산을 초과하면 $(\\text{cost\\_used} > B)$ $\\lambda$는 증가하여 비용 효과를 더 강하게 억제하고, \n", - "예산보다 적게 사용하면 $\\lambda$는 감소하여 더 많은 고객이 선택될 수 있도록 조정됩니다.\n", - "\n", - "Train 데이터에서 양의 Cost CATE 합을 기반으로 예산 $B$를 설정하고, \n", - "위 규칙을 반복 적용하여 최종 $\\lambda^*$와 정책을 학습합니다." 
+ "- 예산 초과($\\text{cost\\_used} > B$) → $\\lambda$ 증가 → cost가 큰 고객 penalize\n", + "- 예산 미만($\\text{cost\\_used} < B$) → $\\lambda$ 감소 → 더 많은 고객 선택 허용" ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 14, "id": "5fe3e686", "metadata": {}, "outputs": [], @@ -1025,7 +1084,7 @@ "\n", " for it in range(n_iter + 1):\n", " # effectiveness score\n", - " s = tau_r - lam * tau_c\n", + " s = tau_r - lam * tau_c_pos\n", "\n", " # z_i: 선택 여부 (s_i >= 0 이면 선택)\n", " z = (s >= 0).astype(float)\n", @@ -1050,12 +1109,12 @@ " print(\"\\n최종 λ*:\", lam)\n", " print(\"총 양의 cost effect 합:\", total_pos_cost)\n", " print(f\"예산 B (fraction={budget_fraction}):\", B)\n", - " return lam, B\n" + " return lam, B" ] }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 16, "id": "233a6a45", "metadata": {}, "outputs": [ @@ -1063,29 +1122,30 @@ "name": "stdout", "output_type": "stream", "text": [ - "[iter 000] λ=0.228487, cost_used=34077.3618, gain_used=22363.6067, grad=22848.7019, selected=0.907\n", - "[iter 020] λ=0.784009, cost_used=11251.8179, gain_used=12501.7237, grad=23.1579, selected=0.298\n", - "[iter 040] λ=0.784586, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", - "[iter 060] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", - "[iter 080] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", - "[iter 100] λ=0.784587, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", - "[iter 120] λ=0.784588, cost_used=11228.2803, gain_used=12483.2661, grad=-0.3797, selected=0.297\n", - "[iter 140] λ=0.784588, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", - "[iter 160] λ=0.784588, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", - "[iter 180] λ=0.784589, cost_used=11229.1281, gain_used=12483.9314, grad=0.4682, selected=0.297\n", - "[iter 200] λ=0.784589, cost_used=11229.1281, 
gain_used=12483.9314, grad=0.4682, selected=0.297\n", + "[iter 000] λ=0.873750, cost_used=153674.7514, gain_used=95695.1631, grad=87375.0116, selected=0.571\n", + "[iter 020] λ=0.228848, cost_used=27172.5443, gain_used=61222.1379, grad=-39127.1955, selected=0.150\n", + "[iter 040] λ=0.229153, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", + "[iter 060] λ=0.229200, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", + "[iter 080] λ=0.229252, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", + "[iter 100] λ=0.229233, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", + "[iter 120] λ=0.229137, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", + "[iter 140] λ=0.229207, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", + "[iter 160] λ=0.229252, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", + "[iter 180] λ=0.229143, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", + "[iter 200] λ=0.229180, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", "\n", - "최종 λ*: 0.7845891317539325\n", - "총 양의 cost effect 합: 37428.86642372066\n", - "예산 B (fraction=0.3): 11228.659927116198\n" + "최종 λ*: 0.22917953894026566\n", + "총 양의 cost effect 합: 220999.1326870586\n", + "예산 B (fraction=0.3): 66299.73980611758\n" ] } ], "source": [ - "lambda_star, B = duality_learn_lambda(\n", + "# Train 데이터에서 λ* 학습\n", + "lambda_star, B_train = duality_learn_lambda(\n", " tau_r=tau_r_train,\n", " tau_c=tau_c_train,\n", - " budget_fraction=0.3,\n", + " budget_fraction=0.3, # 전체 양의 cost uplift 중 30%를 예산으로\n", " lr=1e-5,\n", " n_iter=200,\n", " verbose_every=20,\n", @@ -1094,16 +1154,16 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 18, "id": "a54ac8cf", "metadata": {}, "outputs": [], "source": [ "def 
selection_summary(tau_r, tau_c, lam, name=\"\"):\n", " tau_r = np.asarray(tau_r).astype(float)\n", - " tau_c = np.asarray(tau_c).astype(float)\n", + " tau_c_pos = np.clip(tau_c, a_min=0.0, a_max=None)\n", "\n", - " s = tau_r - lam * tau_c\n", + " s = tau_r - lam * tau_c_pos\n", " z = (s >= 0).astype(float)\n", "\n", " gain_pos = np.clip(tau_r, 0.0, None)\n", @@ -1128,12 +1188,12 @@ " \"gain_used\": gain_used,\n", " \"cost_used\": cost_used,\n", " \"gain_per_cost\": ratio,\n", - " }\n" + " }" ] }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 19, "id": "6947da6a", "metadata": {}, "outputs": [ @@ -1143,32 +1203,32 @@ "text": [ "\n", "== Selection summary (Train) ==\n", - "λ = 0.784589\n", - "선택 비율: 0.297 (11420 / 38400)\n", - "총 gain (∑ τ_r^+ z): 12483.2661\n", - "총 cost (∑ τ_c^+ z): 11228.2803\n", - "gain / cost 비율: 1.1118\n", + "λ = 0.229180\n", + "선택 비율: 0.448 (32281 / 72053)\n", + "총 gain (∑ τ_r^+ z): 89986.7776\n", + "총 cost (∑ τ_c^+ z): 105798.3648\n", + "gain / cost 비율: 0.8505\n", "\n", "== Selection summary (Val) ==\n", - "λ = 0.784589\n", - "선택 비율: 0.293 (3753 / 12800)\n", - "총 gain (∑ τ_r^+ z): 4124.4445\n", - "총 cost (∑ τ_c^+ z): 3685.4466\n", - "gain / cost 비율: 1.1191\n", + "λ = 0.229180\n", + "선택 비율: 0.443 (7877 / 17774)\n", + "총 gain (∑ τ_r^+ z): 21768.6841\n", + "총 cost (∑ τ_c^+ z): 25771.2861\n", + "gain / cost 비율: 0.8447\n", "\n", "== Selection summary (Test) ==\n", - "λ = 0.784589\n", - "선택 비율: 0.302 (3862 / 12800)\n", - "총 gain (∑ τ_r^+ z): 4210.7541\n", - "총 cost (∑ τ_c^+ z): 3794.4139\n", - "gain / cost 비율: 1.1097\n" + "λ = 0.229180\n", + "선택 비율: 0.410 (8345 / 20333)\n", + "총 gain (∑ τ_r^+ z): 60981.3120\n", + "총 cost (∑ τ_c^+ z): 55160.2068\n", + "gain / cost 비율: 1.1055\n" ] } ], "source": [ "_ = selection_summary(tau_r_train, tau_c_train, lambda_star, name=\"Train\")\n", "_ = selection_summary(tau_r_val, tau_c_val, lambda_star, name=\"Val\")\n", - "_ = selection_summary(tau_r_test, tau_c_test, lambda_star, 
name=\"Test\")\n" + "_ = selection_summary(tau_r_test, tau_c_test, lambda_star, name=\"Test\")" ] }, { @@ -1176,98 +1236,142 @@ "id": "6ec8b309", "metadata": {}, "source": [ - "### 5. Cost Curve & AUCC\n", + "### 4. Cost Curve & AUCC (Test set evaluation)\n", + "\n", + "On the test set, we evaluate the policy's performance with a curve of incremental gain against incremental cost.\n", + "\n", + "1. **Sort by effectiveness score** \n", "\n", - "We evaluate the uplift model relative to cost using the Cost Curve and the area under it (AUCC, Area Under Cost Curve).\n", + " The Duality R-learner score is defined as\n", + " $$\n", + " s(x)=\\tau_r(x)-\\lambda^*\\tau_c(x)\n", + " $$\n", + " and the samples are sorted by this score in descending order.\n", "\n", - "On the test set:\n", + "2. **Estimate the ATE on the top-\\(k\\) prefix** \n", "\n", - "1. Sort in descending order by the Duality score ${s(x) = \\tau_r(x) - \\lambda^* \\tau_c(x)}$\n", - "2. Following the sorted order, take\n", - " - ${\\tau_r^+(x) = \\max(\\tau_r(x), 0)}$\n", - " - ${\\tau_c^+(x) = \\max(\\tau_c(x), 0)}$\n", - " and compute their cumulative sums\n", - "3. Normalize the cumulative cost and gain to the range ${[0,1]}$ by dividing each by its final value\n", - "4. Use the curve running from $(0,0)$ to $(1,1)$ as the Cost Curve\n", - "5. Compute the AUCC by numerical integration:\n", + " Within the top-\\(k\\) group of the sorted samples, compute the ATE for gain and for cost from the observed outcomes \\(Y\\):\n", " $$\n", - " \\text{AUCC} = \\int_0^1 \\text{gain}(x)\\,dx\n", + " \\widehat{ATE}_g(k)=\\mathbb{E}[Y_g\\mid T=1]-\\mathbb{E}[Y_g\\mid T=0],\\quad\n", + " \\widehat{ATE}_c(k)=\\mathbb{E}[Y_c\\mid T=1]-\\mathbb{E}[Y_c\\mid T=0].\n", " $$\n", "\n", - "For comparison, we also compute the Cost Curve and AUCC of a random ranking.\n", + "3. **Compute the total incremental gain/cost** \n", + " Multiplying by \\(n_t(k)\\), the number of samples actually treated within the top-\\(k\\) group, gives\n", + " $$\n", + " \\Delta G(k)=n_t(k)\\cdot \\widehat{ATE}_g(k),\\quad\n", + " \\Delta C(k)=n_t(k)\\cdot \\widehat{ATE}_c(k)\n", + " $$\n", + " which are used as the points of the curve.\n", "\n", - "- AUCC ${\\approx 0.5}$: a policy close to random\n", - "- AUCC ${>} 0.5$: a policy that picks the most efficient customers first\n" + "4. 
**Normalization and Cost Curve construction** \n", + " The points $(\\Delta C(k), \\Delta G(k))$, together with \\((0,0)\\), are normalized as\n", + " $$\n", + " x(k)=\\frac{\\Delta C(k)}{C_{\\text{norm}}},\\quad\n", + " y(k)=\\frac{\\Delta G(k)}{G_{\\text{norm}}}\n", + " $$\n", + " and the curve connecting them is defined as the Cost Curve. \n", + " By default the normalizers are the values for the whole population; if the incremental gain or cost from treating everyone is 0 or less,\n", + " the **maximum over the positive values (max-positive)** is used instead so that curves remain comparable.\n", + "\n", + "5. **AUCC computation** \n", + " The area under the Cost Curve is computed by numerical integration:\n", + " $$\n", + " \\text{AUCC}=\\int_0^1 y(x)\\,dx.\n", + " $$\n" ] }, { "cell_type": "code", - "execution_count": 27, - "id": "56c9cc3e", + "execution_count": 33, + "id": "6039a560", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "== Test set Cost Curve (τ 기반) ==\n", - "max_cost: 12481.778185597159\n", - "max_gain: 7511.766716461641\n", - "Normalized AUCC: 0.6946670819574676\n" - ] - } - ], + "outputs": [], "source": [ - "# Effectiveness score from the Duality R-learner (Test set)\n", - "s_test = tau_r_test - lambda_star * tau_c_test\n", - "\n", - "# sort by score in descending order\n", - "order = np.argsort(-s_test)\n", - "tau_r_sorted = np.clip(tau_r_test[order], 0.0, None) # positive part of gain only\n", - "tau_c_sorted = np.clip(tau_c_test[order], 0.0, None) # positive part of cost only\n", - "\n", - "# cumulative cost / gain\n", - "cum_cost = np.cumsum(tau_c_sorted)\n", - "cum_gain = np.cumsum(tau_r_sorted)\n", - "\n", - "# include the origin\n", - "cum_cost = np.insert(cum_cost, 0, 0.0)\n", - "cum_gain = np.insert(cum_gain, 0, 0.0)\n", - "\n", - "# normalize\n", - "max_cost = cum_cost[-1]\n", - "max_gain = cum_gain[-1]\n", - "\n", - "x = cum_cost / max_cost\n", - "y = cum_gain / max_gain\n", - "\n", - "# compute AUCC\n", - "aucc = np.trapz(y, x)\n", - "\n", - "print(\"== Test set Cost Curve (τ 기반) ==\")\n", - "print(\"max_cost:\", max_cost)\n", - "print(\"max_gain:\", max_gain)\n", - "print(\"Normalized AUCC:\", aucc)\n" + "def cost_curve_aucc(scores, Yg, Yc, T, n_points=80):\n", + " \"\"\"\n", + " Paper-style Y-based Cost Curve:\n", + " 
- sort by score desc\n", + " - for each prefix top-k:\n", + " ATE_gain = mean(Yg|T=1) - mean(Yg|T=0)\n", + " ATE_cost = mean(Yc|T=1) - mean(Yc|T=0)\n", + " ΔGain(k) = n_treat * ATE_gain\n", + " ΔCost(k) = n_treat * ATE_cost\n", + " - normalize (rightmost if possible else max-positive)\n", + " - AUCC = ∫ y dx\n", + " \"\"\"\n", + " scores = np.asarray(scores, float)\n", + " Yg = np.asarray(Yg, float)\n", + " Yc = np.asarray(Yc, float)\n", + " T = np.asarray(T, int)\n", + "\n", + " order = np.argsort(-scores)\n", + " Yg, Yc, T = Yg[order], Yc[order], T[order]\n", + "\n", + " N = len(T)\n", + " ks = np.linspace(1, N, n_points, dtype=int)\n", + "\n", + " inc_g, inc_c = [0.0], [0.0] # include (0,0)\n", + " for k in ks:\n", + " T_k, Yg_k, Yc_k = T[:k], Yg[:k], Yc[:k]\n", + " mt, mc = (T_k == 1), (T_k == 0)\n", + "\n", + " if mt.sum() == 0 or mc.sum() == 0:\n", + " inc_g.append(0.0); inc_c.append(0.0); continue\n", + "\n", + " ate_g = Yg_k[mt].mean() - Yg_k[mc].mean()\n", + " ate_c = Yc_k[mt].mean() - Yc_k[mc].mean()\n", + " n_t = mt.sum()\n", + "\n", + " inc_g.append(ate_g * n_t)\n", + " inc_c.append(ate_c * n_t)\n", + "\n", + " inc_g = np.maximum(np.asarray(inc_g, float), 0.0)\n", + " inc_c = np.asarray(inc_c, float)\n", + "\n", + " max_g, max_c = inc_g[-1], inc_c[-1]\n", + " if max_g <= 0 or max_c <= 0:\n", + " max_g = inc_g[inc_g > 0].max() if np.any(inc_g > 0) else 1.0\n", + " max_c = inc_c[inc_c > 0].max() if np.any(inc_c > 0) else 1.0\n", + "\n", + " x = inc_c / max_c\n", + " y = inc_g / max_g\n", + "\n", + " si = np.argsort(x)\n", + " aucc = np.trapz(y[si], x[si])\n", + " return x, y, aucc\n", + "\n", + "def plot_cost_curve(x, y, aucc, title=\"Cost Curve (Paper-style, Y-based)\", label=\"Model\"):\n", + " plt.figure(figsize=(7, 6))\n", + " plt.plot(x, y, label=f\"{label} (AUCC={aucc:.3f})\")\n", + " plt.plot([0, 1], [0, 1], alpha=0.35, linewidth=1, label=\"y=x benchmark\")\n", + " plt.xlabel(\"Incremental cost (normalized)\")\n", + " plt.ylabel(\"Incremental gain 
(normalized)\")\n", + " plt.title(title)\n", + " plt.grid(alpha=0.3)\n", + " plt.legend()\n", + " plt.tight_layout()\n", + " plt.show()" ] }, { "cell_type": "code", - "execution_count": 28, - "id": "3b6badd9", + "execution_count": 34, + "id": "2b41ea6a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Random ranking AUCC: 0.5006872435398204\n" + "Duality AUCC: 0.6649279978825946\n" ] }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ - "
" + "
" ] }, "metadata": {}, @@ -1275,49 +1379,20 @@ } ], "source": [ - "# 랜덤 베이스라인 Cost Curve\n", - "rng = np.random.default_rng(RANDOM_STATE)\n", - "perm = rng.permutation(len(tau_r_test))\n", - "\n", - "tau_r_rand = np.clip(tau_r_test[perm], 0.0, None)\n", - "tau_c_rand = np.clip(tau_c_test[perm], 0.0, None)\n", - "\n", - "cum_cost_rand = np.cumsum(tau_c_rand)\n", - "cum_gain_rand = np.cumsum(tau_r_rand)\n", - "\n", - "cum_cost_rand = np.insert(cum_cost_rand, 0, 0.0)\n", - "cum_gain_rand = np.insert(cum_gain_rand, 0, 0.0)\n", - "\n", - "x_rand = cum_cost_rand / cum_cost_rand[-1]\n", - "y_rand = cum_gain_rand / cum_gain_rand[-1]\n", - "\n", - "aucc_rand = np.trapz(y_rand, x_rand)\n", - "print(\"Random ranking AUCC:\", aucc_rand)\n", - "\n", - "# 플롯\n", - "plt.figure(figsize=(6, 5))\n", - "plt.plot(x, y, label=f\"Duality R-learner (AUCC={aucc:.3f})\")\n", - "plt.plot(x_rand, y_rand, linestyle=\"--\", label=f\"Random (AUCC={aucc_rand:.3f})\")\n", - "plt.plot([0, 1], [0, 1], alpha=0.4, linewidth=1, label=\"y=x reference\")\n", - "\n", - "plt.xlabel(\"Cumulative cost / max\")\n", - "plt.ylabel(\"Cumulative gain / max\")\n", - "plt.title(\"Cost curve on Test set (τ-based)\")\n", - "plt.legend()\n", - "plt.grid(alpha=0.3)\n", - "plt.tight_layout()\n", - "plt.show()\n" + "scores_duality = tau_r_test - lambda_star * tau_c_test\n", + "x, y, aucc = cost_curve_aucc(scores_duality, Yg_test, Yc_test, T_test, n_points=80)\n", + "\n", + "print(\"Duality AUCC:\", aucc)\n", + "plot_cost_curve(x, y, aucc, title=\"Cost Curve on Test set\", label=\"Duality\")" ] }, { "cell_type": "code", "execution_count": null, - "id": "965eecec", + "id": "d36a48d4", "metadata": {}, "outputs": [], - "source": [ - " " - ] + "source": [] } ], "metadata": { diff --git a/book/prescriptive_analytics/overview.md b/book/prescriptive_analytics/overview.md index 8bcc007..ad8685a 100644 --- a/book/prescriptive_analytics/overview.md +++ b/book/prescriptive_analytics/overview.md @@ -1,5 +1,5 @@ # Prescriptive 
Analytics - Prescriptive Analytics is an analytical approach that uses data to derive optimal decisions. -- Approaches fall broadly into two groups: **Prediction + Optimization** and **Causal Inference + Optimization**. -- This section focuses on **Causal Inference + Optimization** and covers **how to select the most efficient policy or strategy** based on the causal effect of an intervention (CATE). +- Approaches fall broadly into two groups: Prediction + Optimization and Causal Inference + Optimization. +- This section focuses on Causal Inference + Optimization and covers how to select the most efficient policy or strategy based on the causal effect of an intervention (CATE). \ No newline at end of file From 43c7906bd6bb2b18c172bddfc22e6fa340b5a93f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=EC=A1=B0=ED=95=B4=EC=B0=BD?= Date: Sat, 20 Dec 2025 02:21:55 +0900 Subject: [PATCH 4/5] fix: train/val/test split and stabilize duality AUCC evaluation --- ...rning_for_effectiveness_optimization.ipynb | 897 ++++++------------ 1 file changed, 314 insertions(+), 583 deletions(-) diff --git a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb index 0082571..c7f6f34 100644 --- a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb +++ b/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb @@ -52,62 +52,17 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 503, "id": "9114f7da", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "\n", - "

\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", - "from sklearn.linear_model import Ridge, LogisticRegression\n", + "from sklearn.linear_model import Ridge\n", "from sklearn.metrics import r2_score, roc_auc_score\n", + "from sklearn.model_selection import train_test_split\n", "\n", "import fractional_uplift as fr \n", "\n", @@ -135,314 +90,52 @@ "\n", " 을 모두 포함하므로, 비용까지 고려한 처치 최적화 실험에 적합합니다. \n", "\n", - "- 세 가지 DataFrame으로 구성\n", - " - `train_data`\n", - " - `distill_data` (여기서는 validation 역할로 사용)\n", - " - `test_data`\n", - "\n", - "- 주요 컬럼\n", + "- 주요 컬럼:\n", " - `treatment`: 광고/프로모션 노출 여부 (0/1)\n", " - `spend`: 사용자가 발생시킨 매출(이익) → Gain outcome: $(Y^r)$\n", " - `cost`: 해당 고객에게 treatment를 줄 때 들어간 비용 → Cost outcome: $(Y^c)$\n", " - `treatment_propensity`: 실험에서 treatment에 할당될 확률\n", " - `sample_weight`: 샘플 가중치\n", - " - `criteo.features`: feature 컬럼 이름 리스트 (문자열 리스트)" + " - `criteo.features`: feature 컬럼 이름 리스트 (문자열 리스트)\n", + "\n", + "- Train/Val/Test split:\n", + " \n", + " `train_data`를 train / validation / test 로 분리하여 사용합니다.\n", + " " ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 638, "id": "b2b3d7a2", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Train shape: (72053, 19)\n", - "Val shape: (17774, 19)\n", - "Test shape: (20333, 19)\n", - "\n", - "Feature columns: ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'f10', 'f11']\n", - "\n", - "Train head:\n" - ] - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
" - ], - "text/plain": [ - " f0 f1 f2 f3 f4 f5 f6 \\\n", - "44 12.616365 10.059654 8.964588 4.679882 10.280525 4.115453 0.294443 \n", - "187 12.616365 10.059654 8.904597 4.679882 10.280525 4.115453 0.294443 \n", - "484 22.377238 10.059654 8.214383 4.679882 10.280525 4.115453 -2.411115 \n", - "528 12.616365 10.059654 8.350682 4.679882 10.280525 4.115453 0.294443 \n", - "1108 14.617627 10.059654 8.489929 3.907662 13.253813 4.115453 -2.411115 \n", - "\n", - " f7 f8 f9 f10 f11 treatment \\\n", - "44 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", - "187 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", - "484 4.833815 3.971858 13.190056 5.300375 -0.168679 1 \n", - "528 4.833815 3.955396 16.226044 5.300375 -0.168679 1 \n", - "1108 4.833815 3.809530 42.176324 5.737292 -0.560340 1 \n", - "\n", - " conversion treatment_propensity cost_percentage spend cost \\\n", - "44 0 0.85 0.000000 0.000000 0.000000 \n", - "187 0 0.85 0.000000 0.000000 0.000000 \n", - "484 0 0.85 0.000000 0.000000 0.000000 \n", - "528 0 0.85 0.000000 0.000000 0.000000 \n", - "1108 1 0.85 0.090777 36.459294 3.309655 \n", - "\n", - " sample_weight \n", - "44 100.0 \n", - "187 100.0 \n", - "484 100.0 \n", - "528 100.0 \n", - "1108 1.0 " - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "criteo = fr.example_data.CriteoWithSyntheticCostAndSpend.load()\n", "\n", - "train_df = criteo.train_data.copy()\n", - "val_df = criteo.distill_data.copy() # distill_data를 validation 데이터로 사용\n", - "test_df = criteo.test_data.copy()\n", - "features = criteo.features # feature column 리스트\n", - "\n", - "print(\"Train shape:\", train_df.shape)\n", - "print(\"Val shape:\", val_df.shape)\n", - "print(\"Test shape:\", test_df.shape)\n", - "print(\"\\nFeature columns:\", features)\n", + "df_all = criteo.train_data.copy()\n", + "features = criteo.features\n", "\n", - "print(\"\\nTrain head:\")\n", - "train_df.head()\n" - ] - }, - { - "cell_type": 
"code", - "execution_count": 3, - "id": "5c9e7a68", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "treatment 비율: 0.8611161228540103\n", - "spend describe:\n", - "count 72053.000000\n", - "mean 7.638117\n", - "std 15.380174\n", - "min 0.000000\n", - "25% 0.000000\n", - "50% 0.000000\n", - "75% 0.000000\n", - "max 172.747528\n", - "Name: spend, dtype: float64\n", - "cost describe:\n", - "count 72053.000000\n", - "mean 2.558094\n", - "std 7.596816\n", - "min 0.000000\n", - "25% 0.000000\n", - "50% 0.000000\n", - "75% 0.000000\n", - "max 63.162000\n", - "Name: cost, dtype: float64\n" - ] - } - ], - "source": [ - "print(\"treatment 비율:\", train_df[\"treatment\"].mean())\n", - "print(\"spend describe:\")\n", - "print(train_df[\"spend\"].describe())\n", - "print(\"cost describe:\")\n", - "print(train_df[\"cost\"].describe())" - ] - }, - { - "cell_type": "markdown", - "id": "bfdbc4fd", - "metadata": {}, - "source": [ - "### Feature 행렬 & 타겟 정의\n", - "\n", - "- $X$: `features` 컬럼들\n", - "- $T$: `treatment` (0/1)\n", - "- $Y^r$: `spend` (gain)\n", - "- $Y^c$: `cost` (cost)\n", - "\n", - "여기서는:\n", + "# 1) train vs temp(=val+test)\n", + "train_df, temp_df = train_test_split(\n", + " df_all,\n", + " test_size=0.4, # val 0.2 + test 0.2\n", + " random_state=RANDOM_STATE,\n", + " stratify=df_all[\"treatment\"],\n", + ")\n", "\n", - "- 데이터셋이 이미 `train / distill / test`로 나뉘어 있으므로,\n", - " - `train_df` → train\n", - " - `val_df` → validation\n", - " - `test_df` → test\n", - " 로 그대로 사용합니다.\n" + "# 2) temp를 val vs test\n", + "val_df, test_df = train_test_split(\n", + " temp_df,\n", + " test_size=0.5, # temp의 절반 = 0.2\n", + " random_state=RANDOM_STATE,\n", + " stratify=temp_df[\"treatment\"],\n", + ")" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 652, "id": "e286adf5", "metadata": {}, "outputs": [ @@ -450,9 +143,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "X_train shape: (72053, 12)\n", - 
"X_val shape: (17774, 12)\n", - "X_test shape: (20333, 12)\n" + "X_train shape: (43231, 12)\n", + "X_val shape: (14411, 12)\n", + "X_test shape: (14411, 12)\n" ] } ], @@ -473,6 +166,10 @@ "Yc_val = val_df[\"cost\"].values.astype(float)\n", "Yc_test = test_df[\"cost\"].values.astype(float)\n", "\n", + "W_train = train_df[\"sample_weight\"].values.astype(float)\n", + "W_val = val_df[\"sample_weight\"].values.astype(float)\n", + "W_test = test_df[\"sample_weight\"].values.astype(float)\n", + "\n", "print(\"X_train shape:\", X_train.shape)\n", "print(\"X_val shape:\", X_val.shape)\n", "print(\"X_test shape:\", X_test.shape)" @@ -533,23 +230,18 @@ "\n", "- $m^*(X) = \\mathbb{E}[Y \\mid X]$: outcome 평균 모델 \n", "- $e^*(X) = \\mathbb{P}(T=1 \\mid X)$: propensity score \n", - "\n", - "Criteo 셋에서는 gain과 cost가 모두 연속값이므로 \n", - "각각 독립적인 회귀모델을 쓰는 것이 자연스럽습니다.\n", + " > propensity score는 treatment_propensity 컬럼을 그대로 사용합니다.\n", "\n", "- Gain outcome $Y^r = \\texttt{spend}$\n", " - $m_r(x) = \\mathbb{E}[Y^r\\mid X=x]$: Ridge 회귀\n", "\n", "- Cost outcome $Y^c = \\texttt{cost}$\n", - " - $m_c(x) = \\mathbb{E}[Y^c\\mid X=x]$: Ridge 회귀\n", - "\n", - "- Treatment model\n", - " - $e(x) = \\mathbb{P}(T=1\\mid X=x)$: Logistic 회귀" + " - $m_c(x) = \\mathbb{E}[Y^c\\mid X=x]$: Ridge 회귀" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 620, "id": "3e718594", "metadata": {}, "outputs": [ @@ -558,18 +250,18 @@ "output_type": "stream", "text": [ "== m_r(x) 성능 (R^2: spend 회귀) ==\n", - "Train R^2: 0.5991714831471346\n", - "Val R^2: 0.6034863154274288\n", + "Train R^2: 0.5937760649833947\n", + "Val R^2: 0.6185973626734869\n", "\n", "예측값 분포 (Val):\n", - "count 17774.000000\n", - "mean 7.529438\n", - "std 11.912387\n", - "min -5.214262\n", - "25% -0.278099\n", - "50% 1.335855\n", - "75% 12.749807\n", - "max 81.783020\n", + "count 14411.000000\n", + "mean 7.721704\n", + "std 11.980173\n", + "min -5.330875\n", + "25% -0.231392\n", + "50% 1.539356\n", + "75% 12.941192\n", + "max 
80.358582\n", "dtype: float64\n" ] } @@ -595,7 +287,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 621, "id": "083c0aa5", "metadata": {}, "outputs": [ @@ -604,18 +296,18 @@ "output_type": "stream", "text": [ "== m_c(x) 성능 (R^2: cost 회귀) ==\n", - "Train R^2: 0.2841092845226817\n", - "Val R^2: 0.29156262859724946\n", + "Train R^2: 0.2921035090257208\n", + "Val R^2: 0.27535410112615455\n", "\n", "예측값 분포 (Val):\n", - "count 17774.000000\n", - "mean 2.524266\n", - "std 4.070278\n", - "min -24.999222\n", - "25% -0.114549\n", - "50% 0.894167\n", - "75% 4.298206\n", - "max 23.168346\n", + "count 14411.000000\n", + "mean 2.570728\n", + "std 4.126056\n", + "min -19.284588\n", + "25% -0.116940\n", + "50% 0.918125\n", + "75% 4.393160\n", + "max 24.175064\n", "dtype: float64\n" ] } @@ -641,7 +333,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 622, "id": "1aa8d52b", "metadata": {}, "outputs": [ @@ -650,30 +342,19 @@ "output_type": "stream", "text": [ "== e(x) 성능 (AUC: treatment 모델) ==\n", - "Train AUC: 0.5432668234321524\n", - "Val AUC: 0.5470811168626702\n", + "Train AUC: 0.5\n", + "Val AUC: 0.5\n", "\n", "Propensity e(x) range:\n", - "Train: 0.8259820462034405 → 0.9497856673846461\n", - "Val : 0.8258405409868536 → 0.9441271638409029\n" + "Train: 0.8500001287591226 → 0.8500001287591226\n", + "Val : 0.8500001287591226 → 0.8500001287591226\n" ] } ], "source": [ - "# Propensity model e(x) = P(T=1 | X): Logistic 회귀\n", - "propensity = LogisticRegression(\n", - " penalty=\"l2\",\n", - " C=1.0,\n", - " solver=\"lbfgs\",\n", - " max_iter=1000,\n", - " n_jobs=-1,\n", - ")\n", - "\n", - "propensity.fit(X_train, T_train)\n", - "\n", - "e_train = propensity.predict_proba(X_train)[:, 1]\n", - "e_val = propensity.predict_proba(X_val)[:, 1]\n", - "e_test = propensity.predict_proba(X_test)[:, 1]\n", + "e_train = train_df[\"treatment_propensity\"].values.astype(float)\n", + "e_val = 
val_df[\"treatment_propensity\"].values.astype(float)\n", + "e_test = test_df[\"treatment_propensity\"].values.astype(float)\n", "\n", "auc_train_e = roc_auc_score(T_train, e_train)\n", "auc_val_e = roc_auc_score(T_val, e_val)\n", @@ -729,7 +410,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 623, "id": "28eb5e7c", "metadata": {}, "outputs": [], @@ -742,53 +423,61 @@ " e_tr, e_val,\n", " alpha=1.0,\n", " name=\"R-learner\",\n", + " rt_clip=1e-6,\n", "):\n", " \"\"\"\n", " 선형 τ(x) = w^T x 를 R-learner 방식으로 학습.\n", - " - X_tr, X_val: feature 행렬\n", - " - T_tr, T_val: treatment (0/1)\n", - " - Y_tr, Y_val: outcome (gain or cost)\n", - " - m_tr, m_val: m(x) = E[Y|X] 예측값\n", - " - e_tr, e_val: e(x) = P(T=1|X) 예측값\n", + " rY = (T - e(X)) * τ(X) + ε 를 이용해 w를 추정한다.\n", " \"\"\"\n", - " X_tr = np.asarray(X_tr)\n", - " X_val = np.asarray(X_val)\n", - " T_tr = np.asarray(T_tr).astype(float)\n", - " T_val = np.asarray(T_val).astype(float)\n", - " Y_tr = np.asarray(Y_tr).astype(float)\n", - " Y_val = np.asarray(Y_val).astype(float)\n", - " m_tr = np.asarray(m_tr).astype(float)\n", - " m_val = np.asarray(m_val).astype(float)\n", - " e_tr = np.asarray(e_tr).astype(float)\n", - " e_val = np.asarray(e_val).astype(float)\n", + " X_tr = np.asarray(X_tr, dtype=float)\n", + " X_val = np.asarray(X_val, dtype=float)\n", + " T_tr = np.asarray(T_tr, dtype=float)\n", + " T_val = np.asarray(T_val, dtype=float)\n", + " Y_tr = np.asarray(Y_tr, dtype=float)\n", + " Y_val = np.asarray(Y_val, dtype=float)\n", + " m_tr = np.asarray(m_tr, dtype=float)\n", + " m_val = np.asarray(m_val, dtype=float)\n", + " e_tr = np.asarray(e_tr, dtype=float)\n", + " e_val = np.asarray(e_val, dtype=float)\n", "\n", " # residuals\n", " rY_tr = Y_tr - m_tr\n", " rT_tr = T_tr - e_tr\n", "\n", - " # Z = X * rT (각 행을 rT로 스케일링)\n", + " # rT가 너무 작은 경우 클리핑\n", + " rT_tr = np.where(np.abs(rT_tr) < rt_clip, np.sign(rT_tr) * rt_clip, rT_tr)\n", + "\n", + " # Z = X * rT\n", " Z_tr = X_tr * 
rT_tr.reshape(-1, 1)\n", "\n", - " # 회귀: rY ~ Z\n", + " # fit\n", " tau_model = Ridge(alpha=alpha, fit_intercept=False, random_state=RANDOM_STATE)\n", " tau_model.fit(Z_tr, rY_tr)\n", "\n", " # τ_hat(x) = w^T x\n", - " tau_tr = tau_model.predict(X_tr)\n", - " tau_val = tau_model.predict(X_val)\n", + " w = tau_model.coef_.reshape(-1)\n", + " tau_tr = X_tr @ w\n", + " tau_val = X_val @ w\n", + "\n", + " # val에서 rY를 얼마나 설명하는지 확인\n", + " rY_val = Y_val - m_val\n", + " rT_val = T_val - e_val\n", + " pred_rY_val = rT_val * tau_val\n", + " mse_val = np.mean((rY_val - pred_rY_val) ** 2)\n", "\n", " print(f\"== {name} 요약 ==\")\n", " print(\"Train τ_hat summary:\")\n", " print(pd.Series(tau_tr).describe())\n", " print(\"\\nVal τ_hat summary:\")\n", " print(pd.Series(tau_val).describe())\n", + " print(f\"\\nVal check: MSE(rY, rT*tau) = {mse_val:.6f}\")\n", "\n", - " return tau_model, tau_tr, tau_val" + " return tau_model, tau_tr, tau_val\n" ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 624, "id": "03fc2e68", "metadata": {}, "outputs": [ @@ -798,26 +487,28 @@ "text": [ "== Gain R-learner τ_r(x) 요약 ==\n", "Train τ_hat summary:\n", - "count 72053.000000\n", - "mean 0.869965\n", - "std 4.075175\n", - "min -16.366707\n", - "25% -0.214938\n", - "50% 0.119879\n", - "75% 1.240913\n", - "max 74.068790\n", + "count 43231.000000\n", + "mean 0.744968\n", + "std 4.095001\n", + "min -14.967210\n", + "25% -0.467654\n", + "50% 0.084019\n", + "75% 1.151731\n", + "max 75.742621\n", "dtype: float64\n", "\n", "Val τ_hat summary:\n", - "count 17774.000000\n", - "mean 0.847950\n", - "std 4.047337\n", - "min -13.205720\n", - "25% -0.215324\n", - "50% 0.116316\n", - "75% 1.209327\n", - "max 66.496148\n", - "dtype: float64\n" + "count 14411.000000\n", + "mean 0.805699\n", + "std 4.339214\n", + "min -13.043869\n", + "25% -0.465580\n", + "50% 0.088950\n", + "75% 1.189868\n", + "max 70.995090\n", + "dtype: float64\n", + "\n", + "Val check: MSE(rY, rT*tau) = 91.237132\n" ] } 
], @@ -844,7 +535,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 625, "id": "554cb0c4", "metadata": {}, "outputs": [ @@ -854,26 +545,28 @@ "text": [ "== Cost R-learner τ_c(x) 요약 ==\n", "Train τ_hat summary:\n", - "count 72053.000000\n", - "mean 2.873521\n", - "std 4.356113\n", - "min -18.526381\n", - "25% -0.005015\n", - "50% 0.903932\n", - "75% 4.806126\n", - "max 29.188446\n", + "count 43231.000000\n", + "mean 2.810444\n", + "std 4.418757\n", + "min -19.332760\n", + "25% -0.070645\n", + "50% 0.982358\n", + "75% 4.749005\n", + "max 26.477165\n", "dtype: float64\n", "\n", "Val τ_hat summary:\n", - "count 17774.000000\n", - "mean 2.838168\n", - "std 4.375784\n", - "min -28.228684\n", - "25% -0.024353\n", - "50% 0.862495\n", - "75% 4.697582\n", - "max 26.297773\n", - "dtype: float64\n" + "count 14411.000000\n", + "mean 2.817899\n", + "std 4.431292\n", + "min -20.013473\n", + "25% -0.087577\n", + "50% 1.023093\n", + "75% 4.772627\n", + "max 25.653274\n", + "dtype: float64\n", + "\n", + "Val check: MSE(rY, rT*tau) = 40.727902\n" ] } ], @@ -900,7 +593,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 626, "id": "01798a5e", "metadata": {}, "outputs": [ @@ -910,78 +603,78 @@ "text": [ "== τ_r(x) 요약 ==\n", "[Train]\n", - "count 72053.000000\n", - "mean 0.869965\n", - "std 4.075175\n", - "min -16.366707\n", - "25% -0.214938\n", - "50% 0.119879\n", - "75% 1.240913\n", - "max 74.068790\n", + "count 43231.000000\n", + "mean 0.744968\n", + "std 4.095001\n", + "min -14.967210\n", + "25% -0.467654\n", + "50% 0.084019\n", + "75% 1.151731\n", + "max 75.742621\n", "dtype: float64\n", "\n", "[Val]\n", - "count 17774.000000\n", - "mean 0.847950\n", - "std 4.047337\n", - "min -13.205720\n", - "25% -0.215324\n", - "50% 0.116316\n", - "75% 1.209327\n", - "max 66.496148\n", + "count 14411.000000\n", + "mean 0.805699\n", + "std 4.339214\n", + "min -13.043869\n", + "25% -0.465580\n", + "50% 0.088950\n", + "75% 1.189868\n", + "max 
70.995090\n", "dtype: float64\n", "\n", "[Test]\n", - "count 20333.000000\n", - "mean 2.168109\n", - "std 7.207469\n", - "min -13.727579\n", - "25% -1.233063\n", - "50% 0.928363\n", - "75% 3.340883\n", - "max 76.512638\n", + "count 14411.000000\n", + "mean 0.713794\n", + "std 4.158598\n", + "min -14.772031\n", + "25% -0.479206\n", + "50% 0.054279\n", + "75% 1.114829\n", + "max 61.358405\n", "dtype: float64\n", "\n", "== τ_c(x) 요약 ==\n", "[Train]\n", - "count 72053.000000\n", - "mean 2.873521\n", - "std 4.356113\n", - "min -18.526381\n", - "25% -0.005015\n", - "50% 0.903932\n", - "75% 4.806126\n", - "max 29.188446\n", + "count 43231.000000\n", + "mean 2.810444\n", + "std 4.418757\n", + "min -19.332760\n", + "25% -0.070645\n", + "50% 0.982358\n", + "75% 4.749005\n", + "max 26.477165\n", "dtype: float64\n", "\n", "[Val]\n", - "count 17774.000000\n", - "mean 2.838168\n", - "std 4.375784\n", - "min -28.228684\n", - "25% -0.024353\n", - "50% 0.862495\n", - "75% 4.697582\n", - "max 26.297773\n", + "count 14411.000000\n", + "mean 2.817899\n", + "std 4.431292\n", + "min -20.013473\n", + "25% -0.087577\n", + "50% 1.023093\n", + "75% 4.772627\n", + "max 25.653274\n", "dtype: float64\n", "\n", "[Test]\n", - "count 20333.000000\n", - "mean 8.090442\n", - "std 4.941765\n", - "min -17.154167\n", - "25% 4.801457\n", - "50% 7.960261\n", - "75% 11.408073\n", - "max 26.761067\n", + "count 14411.000000\n", + "mean 2.739378\n", + "std 4.419076\n", + "min -20.425985\n", + "25% -0.102309\n", + "50% 0.915572\n", + "75% 4.637743\n", + "max 28.992231\n", "dtype: float64\n" ] } ], "source": [ "# Test set CATE 예측\n", - "tau_r_test = tau_r_model.predict(X_test)\n", - "tau_c_test = tau_c_model.predict(X_test)\n", + "tau_r_test = X_test @ tau_r_model.coef_.reshape(-1)\n", + "tau_c_test = X_test @ tau_c_model.coef_.reshape(-1)\n", "\n", "print(\"== τ_r(x) 요약 ==\")\n", "print(\"[Train]\")\n", @@ -1055,7 +748,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "id": 
"5fe3e686", "metadata": {}, "outputs": [], @@ -1067,6 +760,7 @@ " lr=1e-5,\n", " n_iter=200,\n", " verbose_every=20,\n", + " scale=1e4\n", "):\n", " \"\"\"\n", " τ_r, τ_c 가 주어졌을 때 Duality gradient ascent로 λ 학습.\n", @@ -1096,7 +790,8 @@ " grad = cost_used - B\n", "\n", " # gradient ascent (λ >= 0 유지)\n", - " lam = max(0.0, lam + lr * grad)\n", + " lr_eff = lr * scale / (total_pos_cost + 1e-12)\n", + " lam = max(0.0, lam + lr_eff * grad)\n", "\n", " if it % verbose_every == 0:\n", " sel_ratio = z.mean()\n", @@ -1114,7 +809,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 628, "id": "233a6a45", "metadata": {}, "outputs": [ @@ -1122,21 +817,21 @@ "name": "stdout", "output_type": "stream", "text": [ - "[iter 000] λ=0.873750, cost_used=153674.7514, gain_used=95695.1631, grad=87375.0116, selected=0.571\n", - "[iter 020] λ=0.228848, cost_used=27172.5443, gain_used=61222.1379, grad=-39127.1955, selected=0.150\n", - "[iter 040] λ=0.229153, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", - "[iter 060] λ=0.229200, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", - "[iter 080] λ=0.229252, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", - "[iter 100] λ=0.229233, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", - "[iter 120] λ=0.229137, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", - "[iter 140] λ=0.229207, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", - "[iter 160] λ=0.229252, cost_used=27149.2112, gain_used=61207.6592, grad=-39150.5286, selected=0.150\n", - "[iter 180] λ=0.229143, cost_used=27165.0076, gain_used=61217.4623, grad=-39134.7322, selected=0.150\n", - "[iter 200] λ=0.229180, cost_used=27164.1299, gain_used=61216.9177, grad=-39135.6099, selected=0.150\n", + "[iter 000] λ=0.036519, cost_used=88238.3493, gain_used=56003.9532, grad=48442.7285, 
selected=0.530\n", + "[iter 020] λ=0.324064, cost_used=46207.1315, gain_used=49466.9730, grad=6411.5107, selected=0.363\n", + "[iter 040] λ=0.366826, cost_used=40887.6177, gain_used=47644.3566, grad=1091.9968, selected=0.333\n", + "[iter 060] λ=0.374218, cost_used=39912.3018, gain_used=47282.8494, grad=116.6809, selected=0.327\n", + "[iter 080] λ=0.374977, cost_used=39803.9172, gain_used=47242.2461, grad=8.2964, selected=0.326\n", + "[iter 100] λ=0.375001, cost_used=39795.3574, gain_used=47239.0362, grad=-0.2634, selected=0.326\n", + "[iter 120] λ=0.375001, cost_used=39795.3574, gain_used=47239.0362, grad=-0.2634, selected=0.326\n", + "[iter 140] λ=0.375002, cost_used=39795.3574, gain_used=47239.0362, grad=-0.2634, selected=0.326\n", + "[iter 160] λ=0.375002, cost_used=39795.3574, gain_used=47239.0362, grad=-0.2634, selected=0.326\n", + "[iter 180] λ=0.375003, cost_used=39795.3574, gain_used=47239.0362, grad=-0.2634, selected=0.326\n", + "[iter 200] λ=0.375003, cost_used=39801.2137, gain_used=47241.2323, grad=5.5929, selected=0.326\n", "\n", - "최종 λ*: 0.22917953894026566\n", - "총 양의 cost effect 합: 220999.1326870586\n", - "예산 B (fraction=0.3): 66299.73980611758\n" + "최종 λ*: 0.3750031632094995\n", + "총 양의 cost effect 합: 132652.0694261761\n", + "예산 B (fraction=0.3): 39795.62082785283\n" ] } ], @@ -1149,51 +844,43 @@ " lr=1e-5,\n", " n_iter=200,\n", " verbose_every=20,\n", + " scale=1e4\n", ")" ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 642, "id": "a54ac8cf", "metadata": {}, "outputs": [], "source": [ "def selection_summary(tau_r, tau_c, lam, name=\"\"):\n", - " tau_r = np.asarray(tau_r).astype(float)\n", - " tau_c_pos = np.clip(tau_c, a_min=0.0, a_max=None)\n", + " tau_r = np.asarray(tau_r, float)\n", + " tau_c = np.asarray(tau_c, float)\n", + " tau_c_pos = np.clip(tau_c, 0.0, None)\n", "\n", " s = tau_r - lam * tau_c_pos\n", " z = (s >= 0).astype(float)\n", "\n", - " gain_pos = np.clip(tau_r, 0.0, None)\n", - " cost_pos = np.clip(tau_c, 
0.0, None)\n", - "\n", - " gain_used = (gain_pos * z).sum()\n", - " cost_used = (cost_pos * z).sum()\n", + " gain_used = (tau_r * z).sum()\n", + " cost_used = (tau_c_pos * z).sum()\n", " sel_ratio = z.mean()\n", - "\n", " ratio = gain_used / cost_used if cost_used > 0 else np.nan\n", "\n", " print(f\"\\n== Selection summary ({name}) ==\")\n", " print(f\"λ = {lam:.6f}\")\n", " print(f\"선택 비율: {sel_ratio:.3f} ({z.sum():.0f} / {len(z)})\")\n", - " print(f\"총 gain (∑ τ_r^+ z): {gain_used:.4f}\")\n", + " print(f\"총 gain (∑ τ_r z): {gain_used:.4f}\")\n", " print(f\"총 cost (∑ τ_c^+ z): {cost_used:.4f}\")\n", " print(f\"gain / cost 비율: {ratio:.4f}\")\n", "\n", - " return {\n", - " \"lambda\": lam,\n", - " \"selected_ratio\": sel_ratio,\n", - " \"gain_used\": gain_used,\n", - " \"cost_used\": cost_used,\n", - " \"gain_per_cost\": ratio,\n", - " }" + " return {\"lambda\": lam, \"selected_ratio\": sel_ratio, \"gain_used\": gain_used, \"cost_used\": cost_used, \"gain_per_cost\": ratio}\n" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 643, "id": "6947da6a", "metadata": {}, "outputs": [ @@ -1203,25 +890,25 @@ "text": [ "\n", "== Selection summary (Train) ==\n", - "λ = 0.229180\n", - "선택 비율: 0.448 (32281 / 72053)\n", - "총 gain (∑ τ_r^+ z): 89986.7776\n", - "총 cost (∑ τ_c^+ z): 105798.3648\n", - "gain / cost 비율: 0.8505\n", + "λ = 0.375003\n", + "선택 비율: 0.326 (14100 / 43231)\n", + "총 gain (∑ τ_r z): 47239.0362\n", + "총 cost (∑ τ_c^+ z): 39795.3574\n", + "gain / cost 비율: 1.1870\n", "\n", "== Selection summary (Val) ==\n", - "λ = 0.229180\n", - "선택 비율: 0.443 (7877 / 17774)\n", - "총 gain (∑ τ_r^+ z): 21768.6841\n", - "총 cost (∑ τ_c^+ z): 25771.2861\n", - "gain / cost 비율: 0.8447\n", + "λ = 0.375003\n", + "선택 비율: 0.328 (4733 / 14411)\n", + "총 gain (∑ τ_r z): 16722.9333\n", + "총 cost (∑ τ_c^+ z): 13560.8063\n", + "gain / cost 비율: 1.2332\n", "\n", "== Selection summary (Test) ==\n", - "λ = 0.229180\n", - "선택 비율: 0.410 (8345 / 20333)\n", - "총 gain (∑ τ_r^+ z): 
60981.3120\n", - "총 cost (∑ τ_c^+ z): 55160.2068\n", - "gain / cost 비율: 1.1055\n" + "λ = 0.375003\n", + "선택 비율: 0.323 (4656 / 14411)\n", + "총 gain (∑ τ_r z): 15984.1864\n", + "총 cost (∑ τ_c^+ z): 12859.2200\n", + "gain / cost 비율: 1.2430\n" ] } ], @@ -1283,65 +970,75 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 654, "id": "6039a560", "metadata": {}, "outputs": [], "source": [ - "def cost_curve_aucc(scores, Yg, Yc, T, n_points=80):\n", - " \"\"\"\n", - " Paper-style Y-based Cost Curve:\n", - " - sort by score desc\n", - " - for each prefix top-k:\n", - " ATE_gain = mean(Yg|T=1) - mean(Yg|T=0)\n", - " ATE_cost = mean(Yc|T=1) - mean(Yc|T=0)\n", - " ΔGain(k) = n_treat * ATE_gain\n", - " ΔCost(k) = n_treat * ATE_cost\n", - " - normalize (rightmost if possible else max-positive)\n", - " - AUCC = ∫ y dx\n", - " \"\"\"\n", + "def cost_curve_aucc(scores, Yg, Yc, T, W=None, n_points=80, clip_negative_gain=False):\n", " scores = np.asarray(scores, float)\n", " Yg = np.asarray(Yg, float)\n", " Yc = np.asarray(Yc, float)\n", " T = np.asarray(T, int)\n", + " if W is None:\n", + " W = np.ones_like(T, dtype=float)\n", + " else:\n", + " W = np.asarray(W, float)\n", "\n", " order = np.argsort(-scores)\n", - " Yg, Yc, T = Yg[order], Yc[order], T[order]\n", + " Yg, Yc, T, W = Yg[order], Yc[order], T[order], W[order]\n", "\n", " N = len(T)\n", " ks = np.linspace(1, N, n_points, dtype=int)\n", "\n", - " inc_g, inc_c = [0.0], [0.0] # include (0,0)\n", + " def wmean(y, w):\n", + " return (y * w).sum() / (w.sum() + 1e-12)\n", + "\n", + " inc_g, inc_c = [0.0], [0.0]\n", " for k in ks:\n", - " T_k, Yg_k, Yc_k = T[:k], Yg[:k], Yc[:k]\n", + " T_k, Yg_k, Yc_k, W_k = T[:k], Yg[:k], Yc[:k], W[:k]\n", " mt, mc = (T_k == 1), (T_k == 0)\n", "\n", " if mt.sum() == 0 or mc.sum() == 0:\n", " inc_g.append(0.0); inc_c.append(0.0); continue\n", "\n", - " ate_g = Yg_k[mt].mean() - Yg_k[mc].mean()\n", - " ate_c = Yc_k[mt].mean() - Yc_k[mc].mean()\n", - " n_t = mt.sum()\n", + " 
ate_g = wmean(Yg_k[mt], W_k[mt]) - wmean(Yg_k[mc], W_k[mc])\n", + " ate_c = wmean(Yc_k[mt], W_k[mt]) - wmean(Yc_k[mc], W_k[mc])\n", + "\n", + " \n", + " w_t = W_k[mt].sum()\n", + " inc_g.append(ate_g * w_t)\n", + " inc_c.append(ate_c * w_t)\n", "\n", - " inc_g.append(ate_g * n_t)\n", - " inc_c.append(ate_c * n_t)\n", + " inc_g = np.asarray(inc_g, float)\n", + " if clip_negative_gain:\n", + " inc_g = np.maximum(inc_g, 0.0)\n", "\n", - " inc_g = np.maximum(np.asarray(inc_g, float), 0.0)\n", - " inc_c = np.asarray(inc_c, float)\n", + " # cost는 음수면 0으로 (안전)\n", + " inc_c = np.maximum(np.asarray(inc_c, float), 0.0)\n", "\n", " max_g, max_c = inc_g[-1], inc_c[-1]\n", - " if max_g <= 0 or max_c <= 0:\n", - " max_g = inc_g[inc_g > 0].max() if np.any(inc_g > 0) else 1.0\n", - " max_c = inc_c[inc_c > 0].max() if np.any(inc_c > 0) else 1.0\n", + " if max_g == 0:\n", + " max_g = np.max(np.abs(inc_g)) if np.max(np.abs(inc_g)) > 0 else 1.0\n", + " if max_c == 0:\n", + " max_c = np.max(inc_c) if np.max(inc_c) > 0 else 1.0\n", "\n", " x = inc_c / max_c\n", " y = inc_g / max_g\n", "\n", - " si = np.argsort(x)\n", - " aucc = np.trapz(y[si], x[si])\n", - " return x, y, aucc\n", - "\n", - "def plot_cost_curve(x, y, aucc, title=\"Cost Curve (Paper-style, Y-based)\", label=\"Model\"):\n", + " x = np.maximum.accumulate(x)\n", + " aucc = np.trapz(y, x)\n", + " return x, y, aucc\n" + ] + }, + { + "cell_type": "code", + "execution_count": 655, + "id": "a60166d5", + "metadata": {}, + "outputs": [], + "source": [ + "def plot_cost_curve(x, y, aucc, title=\"Cost Curve\", label=\"Model\"):\n", " plt.figure(figsize=(7, 6))\n", " plt.plot(x, y, label=f\"{label} (AUCC={aucc:.3f})\")\n", " plt.plot([0, 1], [0, 1], alpha=0.35, linewidth=1, label=\"y=x benchmark\")\n", @@ -1356,7 +1053,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 656, "id": "2b41ea6a", "metadata": {}, "outputs": [ @@ -1364,12 +1061,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "Duality 
AUCC: 0.6649279978825946\n" + "Duality AUCC: 0.6109516208594291\n" ] }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -1379,17 +1076,51 @@ } ], "source": [ - "scores_duality = tau_r_test - lambda_star * tau_c_test\n", - "x, y, aucc = cost_curve_aucc(scores_duality, Yg_test, Yc_test, T_test, n_points=80)\n", + "tau_c_test_pos = np.clip(tau_c_test, 0.0, None)\n", + "scores_duality = tau_r_test - lambda_star * tau_c_test_pos\n", + "x, y, aucc = cost_curve_aucc(scores_duality, Yg_test, Yc_test, T_test, W=W_test, n_points=80)\n", + "\n", "\n", "print(\"Duality AUCC:\", aucc)\n", "plot_cost_curve(x, y, aucc, title=\"Cost Curve on Test set\", label=\"Duality\")" ] }, + { + "cell_type": "code", + "execution_count": 651, + "id": "1d1e4af1", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "train cost_used: 39795.357426534225 selected: 0.32615484258980826\n", + "val cost_used: 13560.806268071696 selected: 0.3284296717785025\n", + "test cost_used: 12859.22000170533 selected: 0.32308653112205954\n", + "B_train: 39795.62082785283\n" + ] + } + ], + "source": [ + "def cost_used_under_policy(tau_r, tau_c, lam):\n", + " tc = np.clip(tau_c, 0.0, None)\n", + " s = tau_r - lam * tc\n", + " z = (s >= 0).astype(float)\n", + " return (tc * z).sum(), z.mean()\n", + "\n", + "for name, tr, tc in [(\"train\", tau_r_train, tau_c_train),\n", + " (\"val\", tau_r_val, tau_c_val),\n", + " (\"test\", tau_r_test, tau_c_test)]:\n", + " c, sel = cost_used_under_policy(tr, tc, lambda_star)\n", + " print(name, \"cost_used:\", c, \"selected:\", sel)\n", + "print(\"B_train:\", B_train)\n" + ] + }, { "cell_type": "code", "execution_count": null, - "id": "d36a48d4", + "id": "220d5e1f", "metadata": {}, "outputs": [], "source": [] From a0278b6fc0e9c0f8f108e6eb97857e8b29650999 Mon Sep 17 00:00:00 2001 From: Funbucket Date: Sun, 4 Jan 2026 18:42:26 +0900 Subject: [PATCH 5/5] add: fractional uplift --- book/_toc.yml | 3 +- ... 
=> budget_constrained_optimization.ipynb} | 10 +- .../fractional_uplift.ipynb | 1994 +++++++++++++++++ 3 files changed, 2005 insertions(+), 2 deletions(-) rename book/prescriptive_analytics/{heterogeneous_causal_learning_for_effectiveness_optimization.ipynb => budget_constrained_optimization.ipynb} (99%) create mode 100644 book/prescriptive_analytics/fractional_uplift.ipynb diff --git a/book/_toc.yml b/book/_toc.yml index 4b59776..06ab718 100644 --- a/book/_toc.yml +++ b/book/_toc.yml @@ -26,4 +26,5 @@ parts: - file: scm/causal_discovery.ipynb - file: prescriptive_analytics/overview.md sections: - - file: prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb \ No newline at end of file + - file: prescriptive_analytics/fractional_uplift.ipynb + - file: prescriptive_analytics/budget_constrained_optimization.ipynb \ No newline at end of file diff --git a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb b/book/prescriptive_analytics/budget_constrained_optimization.ipynb similarity index 99% rename from book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb rename to book/prescriptive_analytics/budget_constrained_optimization.ipynb index c7f6f34..7b69d3f 100644 --- a/book/prescriptive_analytics/heterogeneous_causal_learning_for_effectiveness_optimization.ipynb +++ b/book/prescriptive_analytics/budget_constrained_optimization.ipynb @@ -5,7 +5,15 @@ "id": "d6b70f64", "metadata": {}, "source": [ - "# Heterogeneous Causal Learning for Effectiveness Optimization" + "# Budget Constrained Optimization" + ] + }, + { + "cell_type": "markdown", + "id": "826633ac", + "metadata": {}, + "source": [ + "> This notebook reproduces in Python the method proposed in the paper [Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing](https://arxiv.org/abs/2004.09702)."
] }, { diff --git a/book/prescriptive_analytics/fractional_uplift.ipynb b/book/prescriptive_analytics/fractional_uplift.ipynb new file mode 100644 index 0000000..412a666 --- /dev/null +++ b/book/prescriptive_analytics/fractional_uplift.ipynb @@ -0,0 +1,1994 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "8530047b", + "metadata": {}, + "source": [ + "# Fractional uplift modelling" + ] + }, + { + "cell_type": "markdown", + "id": "91104063", + "metadata": {}, + "source": [ + "> Adapted and condensed from Google LLC's [Fractional Uplift – End to End Example](https://github.com/google-marketing-solutions/fractional_uplift).\n" + ] + }, + { + "cell_type": "markdown", + "id": "601fb081", + "metadata": {}, + "source": [ + "Fractional uplift models suit situations where the cost of a promotion is not fixed in advance. \n", + "\n", + "For example, when a coupon comes with conditions attached, its actual cost is unknown at the time it is issued, because the cost depends on what the user goes on to buy.\n", + "\n", + "Standard uplift modelling falls short in this setting, because it only models the incremental effect of the treatment (incrementality) and does not account for cost.
\n", + "\n", + "Fractional uplift models, by contrast, are designed to optimize several metrics jointly.\n", + "\n", + "A standard uplift model estimates the conditional average treatment effect (CATE) for a single metric (for example, conversion rate or spend).\n", + "\n", + "$$\n", + "f_{\\text{uplift}}(X) = \\text{CATE}_y(X) = E[y \\mid T=1, X] - E[y \\mid T=0, X]\n", + "$$\n", + "\n", + "A fractional uplift model combines the CATE estimates for several metrics into a single score:\n", + "\n", + "$$\n", + "f_{\\text{fractional uplift}}(X) =\n", + "\\begin{cases}\n", + " \\dfrac{\\text{CATE}_\\alpha (X)}\n", + " {\\text{CATE}_\\beta(X) - \\dfrac{\\text{CATE}_\\gamma (X)}{\\delta}},\n", + " & \\text{CATE}_\\beta(X) > \\dfrac{\\text{CATE}_\\gamma (X)}{\\delta} \\\\\n", + " \\infty, & \\text{otherwise}\n", + "\\end{cases}\n", + "$$\n", + "\n", + "The terms are defined as follows.\n", + "\n", + "- $\\alpha$ (Maximize KPI) \n", + " The metric the model tries to maximize.\n", + "\n", + "- $\\beta$ (Constraint KPI) \n", + " The metric that acts as a constraint and should be kept as low as possible.\n", + "\n", + "- $\\gamma$ (Constraint Offset KPI) \n", + " An optional metric that can be used to offset the constraint.\n", + "\n", + "- $\\delta$ (Constraint Offset Scale) \n", + " A constant that scales the constraint offset KPI.
\n", + " It is not needed when no constraint offset KPI is used.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "7f3eb8c5", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARNING: There was an error checking the latest version of pip.\u001b[0m\u001b[33m\n", + "\u001b[0mNote: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -q fractional-uplift pandas numpy statsmodels tensorflow tensorflow_decision_forests matplotlib" + ] + }, + { + "cell_type": "code", + "execution_count": 134, + "id": "e4d6d469", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"OMP_NUM_THREADS\"] = \"8\"\n", + "os.environ[\"TF_NUM_INTRAOP_THREADS\"] = \"8\"\n", + "os.environ[\"TF_NUM_INTEROP_THREADS\"] = \"2\"" + ] + }, + { + "cell_type": "code", + "execution_count": 135, + "id": "97a2ff86", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import statsmodels.api as sm\n", + "\n", + "import matplotlib.pyplot as plt\n", + "import matplotlib.ticker as mtick\n", + "from matplotlib.lines import Line2D\n", + "\n", + "import tensorflow as tf # needed so that tensorflow decision forests runs in eager mode\n", + "\n", + "import warnings\n", + "warnings.filterwarnings(\"ignore\")" + ] + }, + { + "cell_type": "code", + "execution_count": 136, + "id": "77fb1695", + "metadata": {}, + "outputs": [], + "source": [ + "import tensorflow_decision_forests as tfdf\n", + "import fractional_uplift as fr" + ] + }, + { + "cell_type": "code", + "execution_count": 137, + "id": "27a1ae4b", + "metadata": {}, + "outputs": [], + "source": [ + "# Hide the Keras training logs printed when using tfdf\n", + "tfdf.keras.set_training_logs_redirection(False)" + ] + }, + { + "cell_type": "markdown", + "id": "acd5db70", + "metadata": {}, + "source": [ + "## Load the Criteo data\n", + "\n", + "[Criteo
dataset](https://ailab.criteo.com/criteo-uplift-prediction-dataset/) is a public dataset for benchmarking uplift modelling. It was assembled from the results of several incrementality tests, and each row represents one user.\n", + "\n", + "The dataset contains the following information.\n", + "\n", + "- 12 user features (`f0` to `f11`)\n", + "- A treatment indicator (treatment)\n", + "- 2 outcome labels: visits and conversions\n", + "\n", + "The dataset was designed for standard uplift modelling problems that target a single KPI such as conversions or visits. \n", + "\n", + "In this notebook, however, we assume that users are offered a promotion, which gives a more realistic uplift modelling problem.\n", + "\n", + "To do so, we use the following additional metrics.\n", + "\n", + "- **Spend**: The amount a user spends when they convert. It is generated synthetically from the features.\n", + "\n", + "- **Cost**: The coupon cost. It is incurred only on conversion, and only for treated users (T=1)." + ] + }, + { + "cell_type": "code", + "execution_count": 138, + "id": "7ee1e798", + "metadata": {}, + "outputs": [], + "source": [ + "criteo = fr.example_data.CriteoWithSyntheticCostAndSpend.load()" + ] + }, + { + "cell_type": "code", + "execution_count": 139, + "id": "92c3c0ab", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
f0f1f2f3f4f5f6f7f8f9f10f11treatmentconversiontreatment_propensitycost_percentagespendcostsample_weight
4412.61636510.0596548.9645884.67988210.2805254.1154530.2944434.8338153.95539613.1900565.300375-0.168679100.850.0000000.0000000.000000100.0
18712.61636510.0596548.9045974.67988210.2805254.1154530.2944434.8338153.95539613.1900565.300375-0.168679100.850.0000000.0000000.000000100.0
48422.37723810.0596548.2143834.67988210.2805254.115453-2.4111154.8338153.97185813.1900565.300375-0.168679100.850.0000000.0000000.000000100.0
52812.61636510.0596548.3506824.67988210.2805254.1154530.2944434.8338153.95539616.2260445.300375-0.168679100.850.0000000.0000000.000000100.0
110814.61762710.0596548.4899293.90766213.2538134.115453-2.4111154.8338153.80953042.1763245.737292-0.560340110.850.09077736.4592943.3096551.0
\n", + "
" + ], + "text/plain": [ + " f0 f1 f2 f3 f4 f5 f6 \\\n", + "44 12.616365 10.059654 8.964588 4.679882 10.280525 4.115453 0.294443 \n", + "187 12.616365 10.059654 8.904597 4.679882 10.280525 4.115453 0.294443 \n", + "484 22.377238 10.059654 8.214383 4.679882 10.280525 4.115453 -2.411115 \n", + "528 12.616365 10.059654 8.350682 4.679882 10.280525 4.115453 0.294443 \n", + "1108 14.617627 10.059654 8.489929 3.907662 13.253813 4.115453 -2.411115 \n", + "\n", + " f7 f8 f9 f10 f11 treatment \\\n", + "44 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", + "187 4.833815 3.955396 13.190056 5.300375 -0.168679 1 \n", + "484 4.833815 3.971858 13.190056 5.300375 -0.168679 1 \n", + "528 4.833815 3.955396 16.226044 5.300375 -0.168679 1 \n", + "1108 4.833815 3.809530 42.176324 5.737292 -0.560340 1 \n", + "\n", + " conversion treatment_propensity cost_percentage spend cost \\\n", + "44 0 0.85 0.000000 0.000000 0.000000 \n", + "187 0 0.85 0.000000 0.000000 0.000000 \n", + "484 0 0.85 0.000000 0.000000 0.000000 \n", + "528 0 0.85 0.000000 0.000000 0.000000 \n", + "1108 1 0.85 0.090777 36.459294 3.309655 \n", + "\n", + " sample_weight \n", + "44 100.0 \n", + "187 100.0 \n", + "484 100.0 \n", + "528 100.0 \n", + "1108 1.0 " + ] + }, + "execution_count": 139, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "criteo.train_data.head()" + ] + }, + { + "cell_type": "markdown", + "id": "bdef041d", + "metadata": {}, + "source": [ + "## Experiment Analysis\n", + "\n", + "We first analyse the overall effect of the coupon as a standard A/B test. \n", + "\n", + "Because the coupon gives users a discount on their purchase amount, we evaluate the campaign's incremental RoI (iRoI).\n", + "\n", + "iRoI is defined as follows, where each term is a per-user average within the indicated group.\n", + "\n", + "$$\n", + "\\text{iRoI} = \\frac{\\text{Spend}_{T=1} - \\text{Spend}_{T=0}}{\\text{Cost}_{T=1}}\n", + "$$\n", + "\n", + "We estimate the iRoI and its confidence interval using ordinary least squares and the delta method.\n", + "\n", + "**Note**: The delta-method confidence interval for the iRoI ignores the correlation between spend and cost in the treatment group. As a result, the confidence interval may be wider than it really is.
Since this is not the main topic, we keep this approach for simplicity.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 140, + "id": "7c472fde", + "metadata": {}, + "outputs": [], + "source": [ + "def perform_ols(input_df: pd.DataFrame, target: str) -> tuple[float, float]:\n", + " \"\"\"Runs an ordinary least squares (OLS) regression with statsmodels.\n", + "\n", + " Regresses the target variable on the treatment indicator,\n", + " which plays the same role as a t-test for the difference in means between the two groups.\n", + "\n", + " Args:\n", + " input_df: DataFrame to run the regression on.\n", + " target: Name of the variable whose effect is estimated.\n", + "\n", + " Returns:\n", + " A tuple (effect_size, standard_error).\n", + " - effect_size: treatment effect (difference in means)\n", + " - standard_error: HC0 (heteroskedasticity-robust) standard error of the treatment effect\n", + " \"\"\"\n", + " Y = input_df[target].values\n", + " X = input_df[[\"treatment\"]]\n", + " X = sm.add_constant(X)\n", + " model = sm.OLS(Y, X)\n", + " results = model.fit()\n", + "\n", + " effect_size = results.params[\"treatment\"]\n", + " se = results.HC0_se[\"treatment\"]\n", + " return effect_size, se\n", + "\n", + "\n", + "def estimate_incremental_roi(data: pd.DataFrame) -> tuple[float, float, float]:\n", + " \"\"\"Estimates the incremental RoI and its uncertainty with the delta method.\n", + "\n", + " This is not an exact estimate, because it ignores the correlation\n", + " between spend and cost among treated users.\n", + " The true confidence interval may be narrower,\n", + " so the values computed here are conservative.\n", + "\n", + " Args:\n", + " data: DataFrame to analyse.\n", + "\n", + " Returns:\n", + " A tuple (incremental_roi, lower_bound, upper_bound).\n", + " - incremental_roi: point estimate of the incremental RoI\n", + " - lower_bound: lower bound of the confidence interval (estimate minus 2 standard errors)\n", + " - upper_bound: upper bound of the confidence interval (estimate plus 2 standard errors)\n", + " \"\"\"\n", + " effect_size_spend, spend_se = perform_ols(data, \"spend\")\n", + " avg_cost = data.loc[data[\"treatment\"] == 1, \"cost\"].mean()\n", + " cost_se = (\n", + " data.loc[data[\"treatment\"] == 1, \"cost\"].std()\n", + " / np.sqrt(np.sum(data[\"treatment\"]))\n", + " )\n", + "\n", + " inc_roi = effect_size_spend / avg_cost\n", + " inc_roi_se = np.abs(inc_roi) * np.sqrt(\n", + " spend_se**2 / effect_size_spend**2\n", + " + cost_se**2 / avg_cost**2\n",
+ " )\n", + "\n", + " inc_roi_lb = inc_roi - 2.0 * inc_roi_se\n", + " inc_roi_ub = inc_roi + 2.0 * inc_roi_se\n", + "\n", + " return inc_roi, inc_roi_lb, inc_roi_ub" + ] + }, + { + "cell_type": "code", + "execution_count": 141, + "id": "b78d9b3b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Incremental RoI = 0.97 [Lower Bound=0.90, Upper Bound=1.03]\n" + ] + } + ], + "source": [ + "inc_roi, inc_roi_lb, inc_roi_ub = estimate_incremental_roi(criteo.data)\n", + "print(\n", + " f\"Incremental RoI = {inc_roi:.2f} \"\n", + " f\"[Lower Bound={inc_roi_lb:.2f}, Upper Bound={inc_roi_ub:.2f}]\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "664e5bd8", + "metadata": {}, + "source": [ + "The treatment appears to have a clear, measurable effect (the lower bound of the iRoI, 0.90, is well above 0). \n", + "\n", + "However, the cost is also high, so we cannot say with confidence that the RoI exceeds 1.0: the interval from 0.90 to 1.03 straddles it. In other words, the promotion may be losing money.\n", + "\n", + "This is exactly the kind of situation uplift modelling is suited to. If the promotion is targeted only at the right users, a higher iRoI should be achievable." + ] + }, + { + "cell_type": "markdown", + "id": "3fd3f766", + "metadata": {}, + "source": [ + "## Uplift Modelling\n", + "\n", + "We now train several uplift models to optimize this promotion campaign.\n" + ] + }, + { + "cell_type": "markdown", + "id": "8ea86385", + "metadata": {}, + "source": [ + "### Distillation\n", + "\n", + "All the uplift models used in this notebook are meta learners: each one combines several machine learning models into a single uplift model.\n", + "\n", + "For example, the T-Learner described below consists of two models, one predicting the response of the treatment group and one predicting the response of the control group.\n", + "\n", + "Serving several models together can add inference latency in a production environment. \n", + "\n", + "To address this, every meta learner in the fractional uplift package provides a distill method, which produces a single model that approximates the full uplift model.\n", + "\n", + "The examples below compare the performance of the models with and without distillation."
+ ] + }, + { + "cell_type": "markdown", + "id": "868108a3", + "metadata": {}, + "source": [ + "### The T-Learner (baseline)\n", + "\n", + "We start with a conventional uplift model as a baseline to compare the fractional uplift model against.\n", + "\n", + "Here we use the T-Learner, which is relatively simple and performs reliably. To estimate the uplift for a single KPI, the T-Learner trains two models, one on the control group and one on the treatment group, and takes the difference between their predictions as the uplift.\n", + "\n", + "We train two T-Learners, one estimating the uplift in conversion and one estimating the uplift in spend." + ] + }, + { + "cell_type": "code", + "execution_count": 142, + "id": "d2f0d38c", + "metadata": {}, + "outputs": [], + "source": [ + "test_dataset = fr.datasets.PandasDataset(\n", + " features_data=criteo.test_data[criteo.features]\n", + ")\n", + "distill_dataset = fr.datasets.PandasDataset(\n", + " features_data=criteo.distill_data[criteo.features]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 143, + "id": "6ba9653f", + "metadata": {}, + "outputs": [], + "source": [ + "def get_base_regressor():\n", + " # Default regression model used as an example.\n", + " # In a real project, it is better to tune hyperparameters using the tuning argument.\n", + " # See the tensorflow decision forests documentation for details.\n", + " return fr.base_models.TensorflowDecisionForestRegressor(\n", + " tfdf.keras.GradientBoostedTreesModel,\n", + " init_args=dict(verbose=0, max_depth=6, num_trees=300, shrinkage=0.1),\n", + " fit_args=dict(verbose=0)\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "414804f0", + "metadata": {}, + "outputs": [], + "source": [ + "for target in [\"conversion\", \"spend\"]:\n", + " print(f\"\\nTraining the {target} T-learner\\n\")\n", + "\n", + " train_data = fr.datasets.PandasTrainData(\n", + " features_data=criteo.train_data[criteo.features],\n", + " maximize_kpi=criteo.train_data[target].values,\n", + " is_treated=criteo.train_data[\"treatment\"].values,\n", + " treatment_propensity=criteo.train_data[\"treatment_propensity\"].values,\n", + " sample_weight=criteo.train_data[\"sample_weight\"].values,\n", + " 
shuffle_seed=1234\n", + " )\n", + "\n", + " t_learner = fr.meta_learners.TLearner(get_base_regressor())\n", + " t_learner.fit(train_data)\n", + "\n", + " distill_t_learner = get_base_regressor()\n", + " t_learner.distill(distill_dataset, distill_t_learner)\n", + "\n", + " criteo.test_data[f\"{target}_t_learner_score\"] = t_learner.predict(test_dataset)\n", + " criteo.test_data[f\"{target}_t_learner_score_distill\"] = distill_t_learner.predict(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 145, + "id": "32668b60", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
f0f1f2f3f4f5f6f7f8f9...conversiontreatment_propensitycost_percentagespendcostsample_weightconversion_t_learner_scoreconversion_t_learner_score_distillspend_t_learner_scorespend_t_learner_score_distill
100713.22150910.0596548.471764-1.66639613.0591692.230907-14.4763629.1703243.79394639.917532...10.850.05109137.3868481.9101231.00.0351700.0539630.2172230.484556
108015.64377210.0596548.2328223.90766211.0295844.115453-1.2882074.8338153.85804134.180688...10.850.46932337.69064917.6890781.00.0157090.0153310.6613320.196871
129513.23681211.1193098.329031-2.57001511.5610501.128518-18.6596765.8146413.85565234.621266...10.850.00747845.2827600.3386391.0-0.233079-0.195807-12.409683-8.438211
167019.62257410.0596548.3052113.90766213.2538134.115453-1.2882074.8338153.80332441.176485...10.850.01095844.8727550.4917141.0-0.137972-0.126640-10.253737-9.813921
194012.61636510.0596548.6819764.67988211.0295844.1154530.2944434.8338153.86471113.190056...10.850.99362219.27487719.1519401.00.0023560.0011060.003596-0.081532
\n", + "

5 rows × 23 columns

\n", + "
" + ], + "text/plain": [ + " f0 f1 f2 f3 f4 f5 \\\n", + "1007 13.221509 10.059654 8.471764 -1.666396 13.059169 2.230907 \n", + "1080 15.643772 10.059654 8.232822 3.907662 11.029584 4.115453 \n", + "1295 13.236812 11.119309 8.329031 -2.570015 11.561050 1.128518 \n", + "1670 19.622574 10.059654 8.305211 3.907662 13.253813 4.115453 \n", + "1940 12.616365 10.059654 8.681976 4.679882 11.029584 4.115453 \n", + "\n", + " f6 f7 f8 f9 ... conversion \\\n", + "1007 -14.476362 9.170324 3.793946 39.917532 ... 1 \n", + "1080 -1.288207 4.833815 3.858041 34.180688 ... 1 \n", + "1295 -18.659676 5.814641 3.855652 34.621266 ... 1 \n", + "1670 -1.288207 4.833815 3.803324 41.176485 ... 1 \n", + "1940 0.294443 4.833815 3.864711 13.190056 ... 1 \n", + "\n", + " treatment_propensity cost_percentage spend cost \\\n", + "1007 0.85 0.051091 37.386848 1.910123 \n", + "1080 0.85 0.469323 37.690649 17.689078 \n", + "1295 0.85 0.007478 45.282760 0.338639 \n", + "1670 0.85 0.010958 44.872755 0.491714 \n", + "1940 0.85 0.993622 19.274877 19.151940 \n", + "\n", + " sample_weight conversion_t_learner_score \\\n", + "1007 1.0 0.035170 \n", + "1080 1.0 0.015709 \n", + "1295 1.0 -0.233079 \n", + "1670 1.0 -0.137972 \n", + "1940 1.0 0.002356 \n", + "\n", + " conversion_t_learner_score_distill spend_t_learner_score \\\n", + "1007 0.053963 0.217223 \n", + "1080 0.015331 0.661332 \n", + "1295 -0.195807 -12.409683 \n", + "1670 -0.126640 -10.253737 \n", + "1940 0.001106 0.003596 \n", + "\n", + " spend_t_learner_score_distill \n", + "1007 0.484556 \n", + "1080 0.196871 \n", + "1295 -8.438211 \n", + "1670 -9.813921 \n", + "1940 -0.081532 \n", + "\n", + "[5 rows x 23 columns]" + ] + }, + "execution_count": 145, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "criteo.test_data.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 146, + "id": "6c84863c", + "metadata": {}, + "outputs": [], + "source": [ + "def plot_cumulative_incrementality(\n", + " ax: plt.Axes,\n", + " 
results_data: pd.DataFrame,\n", + " model_names: dict[str, str],\n", + " x_col: str,\n", + " y_col: str,\n", + " title: str = \"\",\n", + " x_label: str | None = None,\n", + " y_label: str | None = None,\n", + " x_format: str = \"{0}\",\n", + " y_format: str = \"{0}\",\n", + " order_col: str = \"share_targeted\",\n", + " random_baseline_name: str = \"random\",\n", + " x_lim: list[float] | None = None,\n", + " y_lim: list[float] | None = None,\n", + " show_legend: bool = True,\n", + " ) -> None:\n", + " \"\"\"Plots the cumulative incrementality of any x and y metrics.\"\"\"\n", + "\n", + " baselines_data = results_data.loc[results_data.name == random_baseline_name].copy().sort_values(order_col)\n", + " models_data = results_data.loc[results_data.name.isin(model_names)].copy().sort_values(order_col)\n", + "\n", + " raw_model_names = list(model_names.keys())\n", + " clean_model_names = list(model_names.values())\n", + "\n", + " ax.plot(baselines_data[x_col], baselines_data[y_col], color=\"k\", lw=1, label=\"\")\n", + "\n", + " for raw_model_name, model_results in models_data.groupby(\"name\"):\n", + "\n", + " if raw_model_name.endswith(\"_distill\"):\n", + " raw_model_name = raw_model_name.removesuffix(\"_distill\")\n", + " label = \"\"\n", + " line_style = \"--\"\n", + " else:\n", + " label = model_names[raw_model_name]\n", + " line_style = \"-\"\n", + "\n", + " color = f\"C{raw_model_names.index(raw_model_name)}\"\n", + " ax.plot(model_results[x_col], model_results[y_col], color=color, lw=1.5, label=label, ls=line_style)\n", + "\n", + " if x_lim is not None:\n", + " ax.set_xlim(x_lim)\n", + " if y_lim is not None:\n", + " ax.set_ylim(y_lim)\n", + "\n", + " ax.set_xlabel(x_label or x_col)\n", + " ax.set_ylabel(y_label or y_col)\n", + " ax.set_title(title)\n", + "\n", + " ax.xaxis.set_major_formatter(mtick.FuncFormatter(lambda x, pos: x_format.format(x)))\n", + " ax.yaxis.set_major_formatter(mtick.FuncFormatter(lambda y, pos: y_format.format(y)))\n", + "\n", + " # Add 
legend\n", + " if show_legend:\n", + " handles, labels = ax.get_legend_handles_labels()\n", + " handles.extend([\n", + " Line2D([0], [0], alpha=0.0),\n", + " Line2D([0], [0], color=\"0.7\", lw=1.5, ls=\"-\"),\n", + " Line2D([0], [0], color=\"0.7\", lw=1.5, ls=\"--\")\n", + " ])\n", + " labels.extend([\n", + " \"\",\n", + " \"Full model\",\n", + " \"Distilled model\"\n", + " ])\n", + " model_legend = ax.legend(\n", + " handles=handles,\n", + " labels=labels,\n", + " loc='upper left',\n", + " bbox_to_anchor=(1, 1)\n", + " )\n", + "\n", + " del(baselines_data)\n", + " del(models_data)" + ] + }, + { + "cell_type": "markdown", + "id": "3418e936", + "metadata": {}, + "source": [ + "We now evaluate the two T-Learners by plotting how incremental spend and incremental conversions grow with the share of users the model selects for targeting.\n", + "\n", + "If a model selected users at random, targeting 50% of all users would yield only about 50% of the total incremental conversions and incremental spend.\n", + "\n", + "If the uplift model has learned something useful, targeting 50% of users should instead yield more than 50% of the incremental conversions and incremental spend.\n", + "\n", + "The uplift curves below show this gap across the whole range of targeted shares." + ] + }, + { + "cell_type": "code", + "execution_count": 147, + "id": "dcba6c3d", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "evaluator = fr.evaluate.UpliftEvaluator(\n", + " metric_cols=[\"spend\", \"conversion\"],\n", + " is_treated_col=\"treatment\",\n", + " treatment_propensity_col=\"treatment_propensity\",\n", + " effect_type=fr.EffectType.ATE\n", + ")\n", + "\n", + "models = {\n", + " \"spend_t_learner_score\": \"Spend T-Learner\",\n", + " \"conversion_t_learner_score\": \"Conversion T-Learner\",\n", + " \"spend_t_learner_score_distill\": \"Spend T-Learner\",\n", + " \"conversion_t_learner_score_distill\": \"Conversion T-Learner\",\n", + "}\n", + "\n", + "results = evaluator.evaluate(criteo.test_data, score_cols=list(models.keys()))\n", + "\n", + "fig, axs = plt.subplots(ncols=2, figsize=(12, 4), constrained_layout=True)\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[0],\n", + " results,\n", + " title=\"Incremental Conversions vs Share Targeted\",\n", + " model_names=models,\n", + " x_col=\"share_targeted\",\n", + " y_col=\"conversion__inc_cum\",\n", + " x_label=\"Share of Users Targeted\",\n", + " y_label=\"Incremental Conversions\",\n", + " x_format=\"{:.0%}\",\n", + " y_format=\"{:,.0f}\",\n", + " show_legend=False,\n", + " x_lim=[0, 1],\n", + " y_lim=[0, None]\n", + ")\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[1],\n", + " results,\n", + " title=\"Incremental Revenue vs Share Targeted\",\n", + " model_names=models,\n", + " x_col=\"share_targeted\",\n", + " y_col=\"spend__inc_cum\",\n", + " x_label=\"Share of Users Targeted\",\n", + " y_label=\"Incremental Revenue\",\n", + " x_format=\"{:.0%}\",\n", + " y_format=\"${:,.0f}\",\n", + " x_lim=[0, 1],\n", + " y_lim=[0, None]\n", + ")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "ed581503", + "metadata": {}, + "source": [ + "The uplift models appear to be working. 
Most of each uplift curve lies above the random baseline (black line), which means the models do better than selecting users at random.\n", + "\n", + "Both models do reasonably well at optimizing incremental conversions, but the spend T-Learner is clearly better at finding incremental revenue.\n", + "\n", + "This is expected, because the conversion T-Learner has no information about spend.\n", + "\n", + "But is this the right way to evaluate an uplift model? Generally not. Targeting the same share of users does not mean spending the same share of the budget. For example, the users a model selects may be the most expensive ones to target.\n", + "\n", + "The next section evaluates the uplift models against more realistic marketing objectives and compares the T-Learners with a fractional uplift model suited to each objective." + ] + }, + { + "cell_type": "markdown", + "id": "353e5314", + "metadata": {}, + "source": [ + "## Fractional uplift modelling" + ] + }, + { + "cell_type": "markdown", + "id": "3000ba3e", + "metadata": {}, + "source": [ + "### Objective 1: Minimum Cost per Incremental Acquisition (CPiA)\n", + "\n", + "The first marketing objective is to minimize the cost per incremental acquisition (CPiA). Here an acquisition means a conversion.\n", + "\n", + "In other words, the goal is to spend as little as possible for each incremental conversion. \n", + "\n", + "CPiA is defined as follows.\n", + "\n", + "$$ \\text{CPiA} = \\frac{\\text{Cost}_{T=1}}{N_{\\text{convert}, \\, T=1} - N_{\\text{convert}, \\, T=0}} $$\n" + ] + }, + { + "cell_type": "markdown", + "id": "6377e331", + "metadata": {}, + "source": [ + "#### Fractional uplift model\n", + "\n", + "To optimize CPiA, the fractional uplift model uses the following metric configuration.\n", + "\n", + "- Maximize KPI ($\\alpha$) = Conversion \n", + "- Constraint KPI ($\\beta$) = Cost \n", + "- Constraint Offset KPI ($\\gamma$) = *not used*\n", + "\n", + "The next cell trains a fractional uplift model with this configuration." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f516ac56", + "metadata": {}, + "outputs": [], + "source": [ + "train_data = fr.datasets.PandasTrainData(\n", + " features_data=criteo.train_data[criteo.features],\n", + " maximize_kpi=criteo.train_data[\"conversion\"].values,\n", + " constraint_kpi=criteo.train_data[\"cost\"].values,\n", + " is_treated=criteo.train_data[\"treatment\"].values,\n", + " treatment_propensity=criteo.train_data[\"treatment_propensity\"].values,\n", + " sample_weight=criteo.train_data[\"sample_weight\"].values,\n", + " shuffle_seed=1234\n", + ")\n", + "fractional_t_learner = fr.meta_learners.FractionalLearner(get_base_regressor())\n", + "fractional_t_learner.fit(train_data)\n", + "\n", + "distill_fractional_t_learner = get_base_regressor()\n", + "fractional_t_learner.distill(distill_dataset, distill_fractional_t_learner)\n", + "\n", + "criteo.test_data[f\"cpia_score\"] = fractional_t_learner.predict(test_dataset)\n", + "criteo.test_data[f\"cpia_score_distill\"] = distill_fractional_t_learner.predict(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 149, + "id": "f707b4c2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<p>5 rows × 25 columns</p>
" + ], + "text/plain": [ + " f0 f1 f2 f3 f4 f5 \\\n", + "1007 13.221509 10.059654 8.471764 -1.666396 13.059169 2.230907 \n", + "1080 15.643772 10.059654 8.232822 3.907662 11.029584 4.115453 \n", + "1295 13.236812 11.119309 8.329031 -2.570015 11.561050 1.128518 \n", + "1670 19.622574 10.059654 8.305211 3.907662 13.253813 4.115453 \n", + "1940 12.616365 10.059654 8.681976 4.679882 11.029584 4.115453 \n", + "\n", + " f6 f7 f8 f9 ... cost_percentage \\\n", + "1007 -14.476362 9.170324 3.793946 39.917532 ... 0.051091 \n", + "1080 -1.288207 4.833815 3.858041 34.180688 ... 0.469323 \n", + "1295 -18.659676 5.814641 3.855652 34.621266 ... 0.007478 \n", + "1670 -1.288207 4.833815 3.803324 41.176485 ... 0.010958 \n", + "1940 0.294443 4.833815 3.864711 13.190056 ... 0.993622 \n", + "\n", + " spend cost sample_weight conversion_t_learner_score \\\n", + "1007 37.386848 1.910123 1.0 0.035170 \n", + "1080 37.690649 17.689078 1.0 0.015709 \n", + "1295 45.282760 0.338639 1.0 -0.233079 \n", + "1670 44.872755 0.491714 1.0 -0.137972 \n", + "1940 19.274877 19.151940 1.0 0.002356 \n", + "\n", + " conversion_t_learner_score_distill spend_t_learner_score \\\n", + "1007 0.053963 0.217223 \n", + "1080 0.015331 0.661332 \n", + "1295 -0.195807 -12.409683 \n", + "1670 -0.126640 -10.253737 \n", + "1940 0.001106 0.003596 \n", + "\n", + " spend_t_learner_score_distill cpia_score cpia_score_distill \n", + "1007 0.484556 0.025971 0.011292 \n", + "1080 0.196871 0.048494 0.030863 \n", + "1295 -8.438211 -0.363601 -0.103153 \n", + "1670 -9.813921 -0.158326 -0.219133 \n", + "1940 -0.081532 0.113706 0.011170 \n", + "\n", + "[5 rows x 25 columns]" + ] + }, + "execution_count": 149, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "criteo.test_data.head()" + ] + }, + { + "cell_type": "markdown", + "id": "b51f3134", + "metadata": {}, + "source": [ + "#### Evaluation\n", + "\n", + "The lowest CPiA corresponds to achieving the most incremental conversions at the lowest cost. 
We evaluate this in two ways.\n", + "\n", + "1. Plot incremental conversions against total cost. The model with the most incremental conversions at a given cost is the best.\n", + "2. Plot CPiA against incremental conversions. The model with the lowest CPiA at a given number of incremental conversions is the best.\n", + "\n", + "These metrics are computed below with a subclass of `UpliftEvaluator`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 150, + "id": "46fddcae", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "class CPIAUpliftEvaluator(fr.evaluate.UpliftEvaluator):\n", + " def __init__(self, **kwargs):\n", + " kwargs[\"metric_cols\"] = [\"spend\", \"conversion\", \"cost\"]\n", + " super().__init__(**kwargs)\n", + "\n", + " def _calculate_composite_metrics(self, data: pd.DataFrame) -> pd.DataFrame:\n", + " data[\"cpia__inc_cum\"] = data[\"cost__inc_cum\"] / data[\"conversion__inc_cum\"]\n", + " data[\"cpia__inc\"] = data[\"cost__inc\"] / data[\"conversion__inc\"]\n", + " return data\n", + "\n", + "evaluator = CPIAUpliftEvaluator(\n", + " is_treated_col=\"treatment\",\n", + " treatment_propensity_col=\"treatment_propensity\",\n", + " effect_type=fr.EffectType.ATE\n", + ")\n", + "\n", + "models = {\n", + " \"spend_t_learner_score\": \"Spend T-Learner\",\n", + " \"conversion_t_learner_score\": \"Conversion T-Learner\",\n", + " \"cpia_score\": \"CPiA Fractional T-Learner\",\n", + " \"cpia_score_distill\": \"CPiA Fractional T-Learner\",\n", + "}\n", + "\n", + "results = evaluator.evaluate(criteo.test_data, score_cols=list(models.keys()))\n", + "\n", + "fig, axs = plt.subplots(ncols=2, figsize=(12, 4), constrained_layout=True)\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[0],\n", + " results,\n", + " title=\"Incremental Conversions vs Cost\",\n", + " model_names=models,\n", + " x_col=\"cost__inc_cum\",\n", + " y_col=\"conversion__inc_cum\",\n", + " x_label=\"Cost\",\n", + " y_label=\"Incremental Conversions\",\n", + " x_format=\"${:,.0f}\",\n", + " y_format=\"{:,.0f}\",\n", + " x_lim=[0, None],\n", + " y_lim=[0, None],\n", + " show_legend=False\n", + ")\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[1],\n", + " results,\n", + " title=\"Cost per Incremental Acquisition (CPiA)\",\n", + " model_names=models,\n", + " x_col=\"conversion__inc_cum\",\n", + " y_col=\"cpia__inc_cum\",\n", + " x_label=\"Incremental Conversions\",\n", + " y_label=\"CPiA\",\n", + " 
x_format=\"{:,.0f}\",\n", + " y_format=\"${:,.0f}\",\n", + " x_lim=[0, None]\n", + ")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "0ac57c64", + "metadata": {}, + "source": [ + "The fractional uplift model clearly outperforms the standard uplift models. At the same cost it produces more incremental conversions and therefore a much lower CPiA. \n", + "\n", + "The T-Learners, by contrast, in some cases actually increase CPiA. Because they ignore cost, they target users with a large incremental effect but a high cost, which results in a worse overall CPiA.\n" + ] + }, + { + "cell_type": "markdown", + "id": "75dda433", + "metadata": {}, + "source": [ + "### Objective 2: Maximum iRoI\n", + "\n", + "We now look at revenue rather than CPiA. Not all conversions are worth the same: some generate far more revenue than others.\n", + "\n", + "If the aim is to generate as much incremental revenue (spend) as possible at the lowest possible cost, the objective becomes maximizing the incremental return on investment (iRoI).\n", + "\n", + "$$\n", + "\\text{iRoI} = \\frac{\\text{Spend}_{T=1} - \\text{Spend}_{T=0}}{\\text{Cost}_{T=1}}\n", + "$$\n" + ] + }, + { + "cell_type": "markdown", + "id": "10405922", + "metadata": {}, + "source": [ + "#### Fractional uplift model\n", + "\n", + "To optimize iRoI, the fractional uplift model uses the following metric configuration.\n", + "\n", + "- Maximize KPI ($\\alpha$) = Spend \n", + "- Constraint KPI ($\\beta$) = Cost \n", + "- Constraint Offset KPI ($\\gamma$) = *not used*\n", + "\n", + "The next cell trains a fractional uplift model with this configuration." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5c23b3cd", + "metadata": {}, + "outputs": [], + "source": [ + "train_data = fr.datasets.PandasTrainData(\n", + " features_data=criteo.train_data[criteo.features],\n", + " maximize_kpi=criteo.train_data[\"spend\"].values,\n", + " constraint_kpi=criteo.train_data[\"cost\"].values,\n", + " is_treated=criteo.train_data[\"treatment\"].values,\n", + " treatment_propensity=criteo.train_data[\"treatment_propensity\"].values,\n", + " sample_weight=criteo.train_data[\"sample_weight\"].values,\n", + " shuffle_seed=1234\n", + ")\n", + "fractional_t_learner = fr.meta_learners.FractionalLearner(get_base_regressor())\n", + "fractional_t_learner.fit(train_data)\n", + "\n", + "distill_fractional_t_learner = get_base_regressor()\n", + "fractional_t_learner.distill(distill_dataset, distill_fractional_t_learner)\n", + "\n", + "criteo.test_data[f\"roi_score\"] = fractional_t_learner.predict(test_dataset)\n", + "criteo.test_data[f\"roi_score_distill\"] = distill_fractional_t_learner.predict(test_dataset)" + ] + }, + { + "cell_type": "markdown", + "id": "8d235815", + "metadata": {}, + "source": [ + "#### Evaluation\n", + "\n", + "The highest iRoI corresponds to achieving the most incremental revenue at the lowest cost. We evaluate this in two ways.\n", + "\n", + "1. Plot incremental revenue against total cost. The model with the most incremental revenue at a given cost is the best.\n", + "2. Plot iRoI against incremental revenue. The model with the highest iRoI at a given level of incremental revenue is the best.\n", + "\n", + "These metrics are computed below with a subclass of `UpliftEvaluator`." + ] + }, + { + "cell_type": "code", + "execution_count": 152, + "id": "0ef46b07", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "class RoIUpliftEvaluator(fr.evaluate.UpliftEvaluator):\n", + " def __init__(self, **kwargs):\n", + " kwargs[\"metric_cols\"] = [\"spend\", \"conversion\", \"cost\"]\n", + " super().__init__(**kwargs)\n", + "\n", + " def _calculate_composite_metrics(self, data: pd.DataFrame) -> pd.DataFrame:\n", + " data[\"roi__inc_cum\"] = data[\"spend__inc_cum\"] / data[\"cost__inc_cum\"]\n", + " data[\"roi__inc\"] = data[\"spend__inc\"] / data[\"cost__inc\"]\n", + " return data\n", + "\n", + "evaluator = RoIUpliftEvaluator(\n", + " is_treated_col=\"treatment\",\n", + " treatment_propensity_col=\"treatment_propensity\",\n", + " effect_type=fr.EffectType.ATE\n", + ")\n", + "\n", + "models = {\n", + " \"spend_t_learner_score\": \"Spend T-Learner\",\n", + " \"conversion_t_learner_score\": \"Conversion T-Learner\",\n", + " \"roi_score\": \"RoI Fractional T-Learner\",\n", + " \"roi_score_distill\": \"RoI Fractional T-Learner\",\n", + "}\n", + "\n", + "results = evaluator.evaluate(criteo.test_data, score_cols=list(models.keys()))\n", + "\n", + "fig, axs = plt.subplots(ncols=2, figsize=(12, 4), constrained_layout=True)\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[0],\n", + " results,\n", + " title=\"Incremental Revenue vs Cost\",\n", + " model_names=models,\n", + " x_col=\"cost__inc_cum\",\n", + " y_col=\"spend__inc_cum\",\n", + " x_label=\"Cost\",\n", + " y_label=\"Incremental Revenue\",\n", + " x_format=\"${:,.0f}\",\n", + " y_format=\"${:,.0f}\",\n", + " show_legend=False,\n", + " x_lim=[0, None],\n", + " y_lim=[0, None],\n", + ")\n", + "\n", + "plot_cumulative_incrementality(\n", + " axs[1],\n", + " results,\n", + " title=\"iRoI vs Incremental Revenue\",\n", + " model_names=models,\n", + " x_col=\"spend__inc_cum\",\n", + " y_col=\"roi__inc_cum\",\n", + " x_label=\"Incremental Revenue\",\n", + " y_label=\"iRoI\",\n", + " x_format=\"${:,.0f}\",\n", + " y_format=\"{:.1f}\",\n", + " 
x_lim=[0, None],\n", + " y_lim=[0, 5]\n", + ")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "eca57559", + "metadata": {}, + "source": [ + "Here too the fractional uplift model outperforms the standard uplift models. The plain spend T-Learner performs reasonably well but falls short of the fractional learner optimized for RoI. \n", + "\n", + "In this case the distilled version does not track the full model closely, which suggests that it needs further tuning." + ] + }, + { + "cell_type": "markdown", + "id": "a7289d07", + "metadata": {}, + "source": [ + "### Objective 3: Maximum Incremental Conversions with an RoI Constraint\n", + "\n", + "The last case is to generate as many incremental conversions as possible while keeping iRoI from falling below a target value.\n", + "\n", + "This example uses an iRoI target of 2.0." + ] + }, + { + "cell_type": "markdown", + "id": "8c6e744f", + "metadata": {}, + "source": [ + "#### Fractional uplift model\n", + "\n", + "The constraint makes this problem somewhat more complex, but it can still be handled with the fractional uplift setup, configured as follows.\n", + "\n", + "* Maximize KPI ($\\alpha$) = Conversion \n", + "* Constraint KPI ($\\beta$) = Cost \n", + "* Constraint Offset KPI ($\\gamma$) = Spend \n", + "* Constraint Offset Scale ($\\delta$) = iRoI target \n", + "\n", + "With this configuration the uplift model estimates the following quantity.\n", + "\n", + "$$\n", + "f_\\delta(X)=\n", + "\\begin{cases}\n", + " \\frac{N_{\\text{convert}, \\, T=1} - N_{\\text{convert}, \\, T=0}}{\\text{Cost}_{T=1} - \\frac{\\text{Spend}_{T=1} - \\text{Spend}_{T=0}}{\\text{iRoI}_\\text{target}}},& \\text{Cost}_{T=1} > \\frac{\\text{Spend}_{T=1} - \\text{Spend}_{T=0}}{\\text{iRoI}_\\text{target}}\\\\\n", + " \\infty, & \\text{otherwise}\n", + "\\end{cases}\n", + "$$\n", + "\n", + "Intuitively, this can be read as follows.\n", + "\n", + "1. If a user's iRoI is at or above the iRoI target, always target that user.\n", + "2. 
If a user's iRoI is below the target, rank users by incremental conversions divided by net cost. \n", + " Net cost here is the cost in excess of what the incremental spend covers at the iRoI target. \n", + " In other words, target first the users who generate the most incremental conversions while pulling iRoI below the target the least.\n", + "\n", + "The next cell implements this." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "263b9561", + "metadata": {}, + "outputs": [], + "source": [ + "target_roi = 2.0\n", + "\n", + "train_data = fr.datasets.PandasTrainData(\n", + " features_data=criteo.train_data[criteo.features],\n", + " maximize_kpi=criteo.train_data[\"conversion\"].values,\n", + " constraint_kpi=criteo.train_data[\"cost\"].values,\n", + " constraint_offset_kpi=criteo.train_data[\"spend\"].values,\n", + " is_treated=criteo.train_data[\"treatment\"].values,\n", + " treatment_propensity=criteo.train_data[\"treatment_propensity\"].values,\n", + " sample_weight=criteo.train_data[\"sample_weight\"].values,\n", + " shuffle_seed=1234\n", + ")\n", + "fractional_t_learner = fr.meta_learners.FractionalLearner(get_base_regressor())\n", + "fractional_t_learner.fit(train_data)\n", + "\n", + "distill_fractional_t_learner = get_base_regressor()\n", + "fractional_t_learner.distill(distill_dataset, distill_fractional_t_learner, constraint_offset_scale=target_roi)\n", + "\n", + "criteo.test_data[f\"roi_constrained_conversion_score\"] = fractional_t_learner.predict(\n", + " test_dataset, constraint_offset_scale=target_roi\n", + ")\n", + "criteo.test_data[f\"roi_constrained_conversion_score_distill\"] = distill_fractional_t_learner.predict(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 154, + "id": "2a9a778f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<p>5 rows × 29 columns</p>
" + ], + "text/plain": [ + " f0 f1 f2 f3 f4 f5 \\\n", + "1007 13.221509 10.059654 8.471764 -1.666396 13.059169 2.230907 \n", + "1080 15.643772 10.059654 8.232822 3.907662 11.029584 4.115453 \n", + "1295 13.236812 11.119309 8.329031 -2.570015 11.561050 1.128518 \n", + "1670 19.622574 10.059654 8.305211 3.907662 13.253813 4.115453 \n", + "1940 12.616365 10.059654 8.681976 4.679882 11.029584 4.115453 \n", + "\n", + " f6 f7 f8 f9 ... \\\n", + "1007 -14.476362 9.170324 3.793946 39.917532 ... \n", + "1080 -1.288207 4.833815 3.858041 34.180688 ... \n", + "1295 -18.659676 5.814641 3.855652 34.621266 ... \n", + "1670 -1.288207 4.833815 3.803324 41.176485 ... \n", + "1940 0.294443 4.833815 3.864711 13.190056 ... \n", + "\n", + " conversion_t_learner_score conversion_t_learner_score_distill \\\n", + "1007 0.035170 0.053963 \n", + "1080 0.015709 0.015331 \n", + "1295 -0.233079 -0.195807 \n", + "1670 -0.137972 -0.126640 \n", + "1940 0.002356 0.001106 \n", + "\n", + " spend_t_learner_score spend_t_learner_score_distill cpia_score \\\n", + "1007 0.217223 0.484556 0.025971 \n", + "1080 0.661332 0.196871 0.048494 \n", + "1295 -12.409683 -8.438211 -0.363601 \n", + "1670 -10.253737 -9.813921 -0.158326 \n", + "1940 0.003596 -0.081532 0.113706 \n", + "\n", + " cpia_score_distill roi_score roi_score_distill \\\n", + "1007 0.011292 0.160407 -1.164981 \n", + "1080 0.030863 2.041546 1.063659 \n", + "1295 -0.103153 -19.358949 -1.164981 \n", + "1670 -0.219133 -11.766439 -1.164981 \n", + "1940 0.011170 0.173577 0.932336 \n", + "\n", + " roi_constrained_conversion_score \\\n", + "1007 0.028235 \n", + "1080 inf \n", + "1295 -0.034047 \n", + "1670 -0.023002 \n", + "1940 0.124512 \n", + "\n", + " roi_constrained_conversion_score_distill \n", + "1007 30.174107 \n", + "1080 30.259451 \n", + "1295 10.612205 \n", + "1670 0.024123 \n", + "1940 3.947203 \n", + "\n", + "[5 rows x 29 columns]" + ] + }, + "execution_count": 154, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + 
"criteo.test_data.head()" + ] + }, + { + "cell_type": "markdown", + "id": "c72e932f", + "metadata": {}, + "source": [ + "#### Evaluation\n", + "\n", + "To evaluate this, we look at the maximum number of incremental conversions each model achieves at the point where its iRoI equals the target of 2.0. \n", + "\n", + "The next cell again uses the uplift evaluator to plot iRoI against incremental conversions." + ] + }, + { + "cell_type": "code", + "execution_count": 155, + "id": "a40a2611", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "evaluator = RoIUpliftEvaluator(\n", + " is_treated_col=\"treatment\",\n", + " treatment_propensity_col=\"treatment_propensity\",\n", + " effect_type=fr.EffectType.ATE\n", + ")\n", + "\n", + "models = {\n", + " \"spend_t_learner_score\": \"Spend T-Learner\",\n", + " \"conversion_t_learner_score\": \"Conversion T-Learner\",\n", + " \"roi_constrained_conversion_score\": \"RoI Constrained Fractional T-Learner\",\n", + " \"roi_constrained_conversion_score_distill\": \"RoI Constrained Fractional T-Learner\",\n", + "}\n", + "\n", + "results = evaluator.evaluate(criteo.test_data, score_cols=list(models.keys()))\n", + "\n", + "fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)\n", + "\n", + "roi_plot = plot_cumulative_incrementality(\n", + " ax,\n", + " results,\n", + " title=\"RoI vs Incremental Conversion\",\n", + " model_names=models,\n", + " x_col=\"conversion__inc_cum\",\n", + " y_col=\"roi__inc_cum\",\n", + " x_label=\"Incremental Conversions\",\n", + " y_label=\"iRoI\",\n", + " x_format=\"{:,.0f}\",\n", + " y_format=\"{:.1f}\",\n", + " x_lim=[0, None],\n", + " y_lim=[0, 5]\n", + ")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "d9e251b3", + "metadata": {}, + "source": [ + "The fractional learner clearly outperforms the plain T-Learners.\n", + "It achieves about 3,000 incremental conversions at an iRoI of 2.0.\n", + "\n", + "The conversion T-Learner, by contrast, never reaches an iRoI of 2.0,\n", + "and the spend T-Learner produces only about 1,000 incremental conversions at an iRoI of 2.0." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}