38 changes: 38 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,38 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]

**Smartphone (please complete the following information):**
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]

**Additional context**
Add any other context about the problem here.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
78 changes: 69 additions & 9 deletions README.md
@@ -1,25 +1,85 @@
# PA-Cloud

Terraform that builds the pa-cloud. Currently this consists of propass related infrastructure.
![Pacific Analytics Logo](docs/images/PA_logo.png)

### Getting started
PA-Cloud is a solution for enabling cross-institutional federated analysis of informatics datasets while maintaining compliance with institutional network policies, and leveraging DataSHIELD to prevent direct disclosure of data.

1. Initialise terraform
# Architectural Overview

PA-Cloud consists of a set of AWS resources alongside deployments of DataSHIELD, the latter of which is deployed onto federated networks and uses an AWS site-to-site VPN to communicate securely.

The site-to-site VPN tunnels converge in AWS, where an EC2 instance runs JupyterHub. JupyterHub spawns Jupyter notebook servers, from which connections to the DataSHIELD servers are established and analysis takes place. This enables cross-institutional analysis while remaining secure and compliant with institutional network policies.

Authentication is provided by an instance of Keycloak operating within the AWS internal network.

# Getting started

## Prerequisites

To deploy PA-Cloud from scratch, you will need:

- An AWS account.
- The AWS CLI installed and logged into the above account (if `aws sts get-caller-identity` succeeds when run, this is met).
- Sufficient permissions to deploy all resources.
- HashiCorp's Terraform installed.
- Access to at least one cohort server, which must have:
  - A constant externally facing IP address (either the IP address of the server itself or that of the NAT gateway it sits behind).
  - Docker installed and running.
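
The local prerequisites above can be checked with a short script. This is a minimal sketch, not part of the repository: the function names `missing_tools` and `check_prerequisites` are hypothetical, and the sketch only covers the deployment machine (Docker on the cohort server has to be verified on that server).

```shell
# Hypothetical helper: print every named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical helper: check the local tooling listed in the prerequisites.
check_prerequisites() {
  missing="$(missing_tools aws terraform ssh-keygen ssh-add)"
  if [ -n "$missing" ]; then
    echo "Missing tools: $missing" >&2
    return 1
  fi
  # Succeeds only when the AWS CLI is logged in to an account.
  if ! aws sts get-caller-identity >/dev/null; then
    echo "AWS CLI is not authenticated" >&2
    return 1
  fi
  # Confirms that the Terraform binary runs.
  terraform version >/dev/null
}

# Usage (run manually on the deployment machine):
#   check_prerequisites && echo "Local prerequisites look satisfied"
```

Whether the account has sufficient permissions cannot be checked this way; that only becomes apparent when Terraform plans or applies the resources.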

> [!NOTE]
> This repository is currently configured for Pacific Analytics' own deployment of PA-Cloud. You will need to adjust various values in Terraform if deploying to a different location. Contributions that generalise the repository or expand the range of deployment environments it supports are welcome.

> [!NOTE]
> The above is provisional, with exact requirements for a novel environment not fully known.

Review comment (Contributor): Maybe invite people to contribute back their experiences.


## Deployment

Deployment is handled solely through Terraform.

1. Generate an SSH key. This key is used to authenticate to the on-premises cohort servers, and its public half must be added to `authorized_keys` on every cohort server. These instructions assume the key was generated as `./pacloud.key`.

2. Add all cohort servers. This can be done in `cohort.tf` by adding further instances of the `cohort` module, ensuring that required additional networking resources are added in `network.tf` as well.

3. Initialise Terraform.
```hcl
terraform init
```

2. Add pacloud ssh key to ssh keyring (can find this in bitwarden) (used to authenticate with on prem cohort servers)
4. Add `pacloud.key` to your SSH keyring:
```
ssh-add <pacloud.key>
ssh-add ./pacloud.key
```

2. Plan terraform
5. Generate a deployment plan with Terraform.
```hcl
terraform plan
terraform plan -out plan.tfplan
```

3. Apply changes
> [!WARNING]
> Review the generated plan to confirm that all required resources are present and correctly configured. Once PA-Cloud has been deployed, always review the plan for any later change before applying it, so that Terraform does not mistakenly perform a destructive action.

6. Apply the previously generated plan.
```hcl
terraform apply
terraform apply plan.tfplan
```
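
The command-line steps above can be chained in a small wrapper. This is a sketch under stated assumptions rather than a script shipped with the repository: the `run` and `deploy` functions and the `DRY_RUN` switch are hypothetical, the `ed25519` key type is an assumption (the README does not name one), and `ssh-add` needs a running SSH agent. Adding cohort servers in `cohort.tf` remains a manual edit and is not covered.

```shell
# Print each command, and execute it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Hypothetical wrapper around the deployment steps, stopping before apply.
deploy() {
  key="${1:-./pacloud.key}"
  # Generate the key only if it does not exist yet; ssh-keygen prompts
  # for a passphrase. The key type is an assumption of this sketch.
  if [ ! -f "$key" ]; then
    run ssh-keygen -t ed25519 -f "$key" || return 1
  fi
  run terraform init || return 1
  run ssh-add "$key" || return 1
  run terraform plan -out plan.tfplan || return 1
  # Stop here so the saved plan can be reviewed before anything is changed.
  echo "Review the plan with: terraform show plan.tfplan"
  echo "Then apply it with:   terraform apply plan.tfplan"
}

# Usage:
#   DRY_RUN=1 deploy ./pacloud.key   # only print the commands
#   deploy ./pacloud.key             # run them for real
```

The wrapper deliberately leaves `terraform apply` to the operator, so the review called for in the warning above always happens between planning and applying.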

## Access

Once Terraform completes its deployment, you can access PA-Cloud via the domain set as the public Route53 zone in `domain.tf`. Assuming this domain is `propass.pacificanalytics.com`, the endpoints are:

- `https://identity.propass.pacificanalytics.com` (Keycloak)
- `https://analytics.propass.pacificanalytics.com` (JupyterHub)

Users must be added to the created realm in Keycloak (`propass` by default) to access JupyterHub. Keycloak's default administrator credentials can be retrieved from Terraform's state file or from AWS Secrets Manager.
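
A quick reachability check of the two endpoints can be sketched as follows. The function names `pacloud_urls` and `check_pacloud_endpoints` are hypothetical, and the sketch assumes `curl` is installed; it only tests that each URL answers without an HTTP error, not that login works.

```shell
# Hypothetical helper: derive the two endpoint URLs from the Route53 domain.
pacloud_urls() {
  domain="$1"
  echo "https://identity.${domain}"
  echo "https://analytics.${domain}"
}

# Hypothetical helper: report whether each endpoint responds without an HTTP error.
check_pacloud_endpoints() {
  for url in $(pacloud_urls "$1"); do
    if curl --silent --show-error --fail --output /dev/null --max-time 10 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Usage:
#   check_pacloud_endpoints propass.pacificanalytics.com
```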

## Analysis

Once a user has been created in the `propass` realm in Keycloak, they can log in to JupyterHub and begin performing analysis using DataSHIELD.

When using DataSHIELD, connect using the names specified in the cohort modules, suffixed with `.pacloud.internal`. For example, a cohort module whose `name` variable is set to `example` is reachable from DataSHIELD at `example.pacloud.internal` once deployed.
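
The naming rule can be expressed as a one-line helper, shown here as a sketch. The function name `cohort_host` is hypothetical, and the `getent` check in the usage comment assumes that tool is available in the notebook environment.

```shell
# Hypothetical helper: map a cohort module's `name` to its internal hostname.
cohort_host() {
  printf '%s.pacloud.internal\n' "$1"
}

# Usage from a terminal inside JupyterHub, to confirm the name resolves
# before pointing DataSHIELD at it:
#   getent hosts "$(cohort_host example)"
```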

# License and Contact
PA-Cloud is available under the Apache 2.0 license. See [LICENSE](./LICENSE) for more details.

For any questions relating to the code or usage thereof, raise an issue. For urgent matters or other inquiries, contact [info@pacificanalytics.com](mailto:info@pacificanalytics.com).
Binary file added docs/images/PA_logo.png