berkeleyflow · kjang96 · Aug 16, 2018
@@ -0,0 +1,151 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Exercise 10: Running RLlib experiments on EC2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This tutorial walks through how to run RLlib experiments on an AWS EC2 instance. This assumes that the machine you are using has already been configured for AWS (i.e. `~/.aws/credentials` is properly set up). For more detailed documentation, please view: https://github.com/ray-project/ray/blob/master/doc/source/autoscaling.rst"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Getting Dependencies \n",
+    "\n",
+    "\n",
+    "* First, make sure your version of ray is tracking http://github.com/eugenevinitsky/ray. To do this, go to your ray directory, run `git remote -v`, and confirm that the remote you are tracking matches \"eugenevinitsky/ray\".\n",
+    "\n",
+    "\n",
+    "* Install the `rayutils` package from https://github.com/richardliaw/rayutils:  \n",
+    "\n",
+    "`pip install -e git+https://github.com/richardliaw/rayutils.git#egg=rayutils`\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Modify Configuration\n",
+    "\n",
+    "This section explains the components of `/learning-traffic/scripts/ray_autoscale.yaml` you'll want to customize. The same descriptions also appear as comments in `ray_autoscale.yaml` itself. We'll go over the variables you should change, as well as a few that might come in handy:\n",
+    "\n",
+    "* Modify `cluster_name`: A unique identifier for the head node and workers of this cluster. If you want to set up multiple clusters, `cluster_name` must be changed each time the script is run.\n",
+    "\n",
+    "    \n",
+    "* Modify `file_mounts`: _change me!_ You'll want to change these file mounts. \n",
+    "    * \"tmp/path\" indicates the path to the version of Flow you intend to use. This is specified in the format `#\"/tmp/path\": \"<PATH TO LEARNING TRAFFIC>/.git/refs/heads/<BRANCH NAME>\"`\n",
+    "    * \"tmp/ray_autoscaler_key\" is the path to the ray autoscaler key. For most users, this will be found in `~/.ssh`."
+   ]
+  },
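+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a rough illustration of the two settings above, the relevant part of `ray_autoscale.yaml` might look like the fragment below. This is a sketch, not the contents of the actual file: the cluster name is a made-up example, the leading slash on `/tmp/ray_autoscaler_key` and the uncommented `/tmp/path` entry are assumptions, and everything in angle brackets is a placeholder you need to fill in yourself.\n",
+    "\n",
+    "```yaml\n",
+    "# Hypothetical example name; pick a different one for each cluster you launch.\n",
+    "cluster_name: my-flow-cluster\n",
+    "\n",
+    "# Each entry maps a path on the cluster (left) to a path on your machine (right).\n",
+    "file_mounts:\n",
+    "    \"/tmp/path\": \"<PATH TO LEARNING TRAFFIC>/.git/refs/heads/<BRANCH NAME>\"\n",
+    "    \"/tmp/ray_autoscaler_key\": \"<PATH TO RAY AUTOSCALER KEY>\"\n",
+    "```"
+   ]
+  },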
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup Clusters\n",
+    "\n",
+    "* To create the cluster, run: `ray create_or_update ray_autoscale.yaml -y`\n",
+    "\n",
+    "* To set up ray in the cluster, run:  `ray2 setup ray_autoscale.yaml`\n",
+    "\n",
+    "Once setup is complete, you can log in to the cluster via:  `$(ray2 login_cmd ray_autoscale.yaml)`. You can also run commands from outside the cluster via: `ray2 submit ray_autoscale.yaml [--background] [--shutdown] test.py`, where `test.py` is an example script."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Run Experiments\n",
+    "\n",
+    "The cluster is all set up and you are ready to run an experiment! From `/learning-traffic/scripts`, run:  \n",
+    "\n",
+    "`./run_rllib.sh -f /Users/kathyjang/research/rllab-multiagent/learning-traffic/examples/rllib/figure_eight.py -s`\n",
+    "\n",
+    "--- \n",
+    "### Results and Caveats\n",
+    "The experiment is now running! By default, results are logged in `~/ray_results`.\n",
+    "\n",
+    "The `run_rllib.sh` script can be run with a few different flags:  \n",
+    "* `-f` is required and indicates the script to be run on the cluster\n",
+    "* `-s` instructs the cluster to shut down after the script is done running\n",
+    "* `-b` runs the script in the background (recommended for long experiments)\n",
+    "* `-n` TODO: This is listed as an option in `run_rllib.sh`, but there's no if clause supporting it, nor do I see an analog in https://github.com/richardliaw/rayutils/blob/master/rayutils/rayutils.py. Whoever wrote this, can you chime in? \n",
+    "\n",
+    "For background users: RLlib uses `screen`, a Linux utility for managing terminal sessions, to run scripts in the background. This means your experiment runs in a \"screen\" separate from the main one you interact with. If you ran an experiment with the `-b` flag, you can check on its progress by logging in to the cluster and entering `screen -r` to reattach that screen. Once reattached, you should immediately see the stdout of your running experiment. To detach again, hit `Ctrl-a` to signal that the next key should go to screen rather than the shell, then hit `d`. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Close Clusters\n",
+    "\n",
+    "If you didn't run `./run_rllib.sh` with the `-s` option, you will need to shut down the cluster manually. To do this, log in to the cluster and run:  \n",
+    "\n",
+    "`ray2 shutdown`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Troubleshooting\n",
+    "\n",
+    "\n",
+    "- NOTE: If pyarrow or Ray is causing problems, the following worked for me. Basically, you have to completely remove ray and reinstall it:\n",
+    "  - `source activate [your_env]`\n",
+    "  - `rm -rf ray`\n",
+    "  - repeat the following until `which ray` returns blank:\n",
+    "    - `which ray`\n",
+    "    - `rm [the output of which ray]`\n",
+    "    - This removes the installed binary. I'm not sure it is necessary, but I did it. After this step it’s as if ray never existed on your system\n",
+    "  - Go to the directory where you want to install ray and run: `git clone https://github.com/eugenevinitsky/ray.git`\n",
+    "  - `cd ray/python`\n",
+    "  - `python setup.py develop`\n",
+    "  \n",
+    "  \n",
+    "  \n",
+    "* `pip install` for rayutils doesn't work\n",
+    "    * The install command in this tutorial runs pip in \"editable\" mode (`-e`)\n",
+    "    * NOTE: Richard Liaw's Git README suggests that you run the following command: `pip install git+https://github.com/richardliaw/rayutils.git`. I suspect there's something wrong with the repo structure, because rayutils is nowhere to be found after pip reports a successful installation. The command in this tutorial instead installs in \"editable\" mode, and the `#egg=rayutils` at the end of the command denotes the name of the package\n",
+    "    TODO: Could someone confirm that ^ doesn't work?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}