Users can access and download every dataset from the following GitHub repository: github.com/ajcoach/DATA422-Group-Project
This repository contains the entire contents of group TBA's project for DATA422.
The contents of this repository are as follows:
- README.md
  - Repository description, file descriptions, and file execution order
- Jupyter notebooks (running them in the order listed is recommended; at a minimum, Suburbs_Julia.ipynb must be run before Crime_Suburbs_R.ipynb)
  - Income_employment_age_data_R.ipynb
  - income_data_Julia.ipynb
  - Suburbs_Julia.ipynb
  - Crime_Suburbs_R.ipynb
  - Meshblock - Area data.ipynb
  - School data.ipynb
  - spatialIndex.ipynb
  - weatherData.ipynb
- Project report.pdf
  - PDF report describing our project
- Project presentation.pdf
  - PDF presentation slides
- Project diary.pdf
  - PDF outlining each group member's contributions
- Datasets input (folder)
  - Folder containing datasets from various public sources
  - See the report's references for the respective owners
- Output datasets in root folder
  - area.csv
  - canterbury_employment.csv
  - canterbury_gender.csv
  - canterbury_income.csv
  - crimes_population_data.csv
  - meshblock.csv
  - schools.csv
  - suburbs.csv
- spatialData (folder containing the spatial index and weather output data)
  - spatialIndexFull.feather
  - spatialIndexTrimmed.csv
  - spatialIndexTrimmed.feather
  - weatherFull-canterbury.feather
  - weatherTrimmed-canterbury.csv
- Data Model.png
  - Image of final data model
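The recommended notebook order listed above can be scripted. The following is a minimal Python sketch and is not part of the repository. It assumes the notebooks sit in the repository root, that `jupyter nbconvert` is installed, and that the R and Julia Jupyter kernels the notebooks need are available. The helper names (`check_order`, `build_commands`, `run_all`) are hypothetical. By default it only prints the commands it would run.

```python
import subprocess
from pathlib import Path

# Notebook names copied from the list above, in the recommended order.
NOTEBOOKS = [
    "Income_employment_age_data_R.ipynb",
    "income_data_Julia.ipynb",
    "Suburbs_Julia.ipynb",
    "Crime_Suburbs_R.ipynb",
    "Meshblock - Area data.ipynb",
    "School data.ipynb",
    "spatialIndex.ipynb",
    "weatherData.ipynb",
]

# The one hard constraint stated above, as (prerequisite, dependent).
MUST_PRECEDE = [("Suburbs_Julia.ipynb", "Crime_Suburbs_R.ipynb")]


def check_order(notebooks, constraints):
    """Raise ValueError if a prerequisite is not listed before its dependent."""
    position = {name: i for i, name in enumerate(notebooks)}
    for first, second in constraints:
        if position[first] >= position[second]:
            raise ValueError(f"{first} must run before {second}")


def build_commands(notebooks, repo_root="."):
    """Return one `jupyter nbconvert` command (as an argv list) per notebook."""
    root = Path(repo_root)  # assumption: notebooks live in the repository root
    return [
        ["jupyter", "nbconvert", "--to", "notebook",
         "--execute", "--inplace", str(root / name)]
        for name in notebooks
    ]


def run_all(repo_root=".", dry_run=True):
    """Validate the order, then print (dry run) or execute each command."""
    check_order(NOTEBOOKS, MUST_PRECEDE)
    commands = build_commands(NOTEBOOKS, repo_root)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` would execute the notebooks in place, so it is worth trying the dry run first and confirming the printed paths match the local clone.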
Together, the output datasets form a final data model that looks like this:

