This project uses environmental sensor data for a room, published in the UCI Machine Learning Repository, to build an occupancy detection model. The data can be found and downloaded here:
Our goal is to build a model that predicts whether a room is occupied from its sensor readings (e.g., temperature and humidity). Specifically, we aim to maximize the model's overall accuracy.
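Here we take overall accuracy to mean the share of observations whose predicted occupancy matches the observed value. A minimal sketch of that calculation follows. The helper name `overall_accuracy` and the toy label vectors are illustrative assumptions, not taken from occupancy.R.

```r
# Hypothetical helper: fraction of predictions that match the observed labels.
overall_accuracy <- function(predicted, actual) {
  stopifnot(length(predicted) == length(actual))
  mean(predicted == actual)
}

# Toy labels for illustration only (assumed coding: 1 = occupied, 0 = not occupied).
actual    <- c(0, 0, 1, 1, 0, 1)
predicted <- c(0, 1, 1, 1, 0, 0)

# Four of the six toy predictions match, so this is 4/6.
overall_accuracy(predicted, actual)
```

Because this metric weights every observation equally, it can look strong when one class dominates, so it is worth comparing against the accuracy of always predicting the majority class.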
This repo contains 3 files:
| File | Description |
|---|---|
| occupancy.R | The R code used to process the data and build our predictive models |
| occupancy.Rmd | The R Markdown file, which produces the full PDF report with narrative |
| occupancy.pdf | The final report, knitted from occupancy.Rmd |
Problem Description:
The data we use to build and test our model contains approximately 20,500 observations taken over about 2.5 weeks in February 2015. It has 1 target variable (occupancy) and 6 predictors (date, temperature, humidity, light, co2, humidity_ratio).
Occupancy is a binary field that indicates whether the room is occupied at the time of observation. Each row in our data represents an observation made at a given time (date); observations are typically made at 1-minute intervals. Each observation includes readings from electronic sensors for temperature, humidity, light, and CO2 levels. Ground-truth occupancy was determined from time-stamped photos taken alongside the sensor readings.
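To make the row structure concrete, the sketch below builds a tiny data frame with the column names listed above. The values are made up for illustration and are not rows from the dataset. The timestamp format and the 0/1 coding of occupancy are also assumptions.

```r
# Toy rows for illustration only; these values are made up, not real observations.
toy <- data.frame(
  date = c("2015-02-04 17:51:00", "2015-02-04 17:52:00", "2015-02-04 17:53:00"),
  temperature = c(23.1, 23.1, 23.2),
  humidity = c(27.2, 27.3, 27.2),
  light = c(426, 430, 0),
  co2 = c(721, 714, 690),
  humidity_ratio = c(0.0048, 0.0048, 0.0047),
  occupancy = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Parse the timestamp (assumed format) and treat the target as a two-level factor.
toy$date <- as.POSIXct(toy$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
toy$occupancy <- factor(toy$occupancy, levels = c(0, 1),
                        labels = c("unoccupied", "occupied"))

# Gaps between consecutive observations, in minutes.
gaps <- as.numeric(diff(toy$date), units = "mins")

str(toy)
```

Inspecting `gaps` on the real data is one way to check how consistently observations arrive at 1-minute intervals.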