Google Forms data fetching, parsing, and visualization on a public web interface. Learn more about Hack4OpenGLAM (summit.creativecommons.org).
The project consists of three main components:
- front-end
- back-end
- importer
Each of these components lives in its own directory.
| Component | Overview |
|---|---|
| frontend | Create React App (CRA) frontend |
| backend | Exposes a REST API for the frontend to call. Reads data from the importer's output, filters it according to the API request's parameters, and responds with it. |
| importer | Imports data from Google Spreadsheets and parses it into a clean, machine-readable format. The importer runs at 5-minute intervals, so the data is refreshed every 5 minutes. |
The following flowchart shows each component's functions in order.
Parsed and formatted data is saved locally to data/data.json. It is an array of objects. Each object is a single data entry.
There can be multiple data entries in a single Google Spreadsheets row. For example, a submission could contain details of a person, but that person may also have wanted to submit a project. This is how multiple entries end up in a single row.
The importer reads a single row and separates its entries into the output. This is repeated for all rows of the spreadsheet, as illustrated below.
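To make this concrete, here is a hypothetical illustration of what data/data.json could contain after importing one submission that yields both a person and a project. The field names are assumptions based on the API filters described later, not the importer's exact schema.

```js
// Hypothetical contents of data/data.json (field names are assumptions):
// one form submission (one spreadsheet row) produced two entries.
const entries = [
  {
    type: "people",
    slug: "alice",
    name: "Alice Example",
    ownerHash: "5d41402abc4b2a76b9719d911017c592", // hash of the submitter's email
  },
  {
    type: "projects",
    slug: "painting-a-painting",
    title: "Painting a Painting",
    ownerHash: "5d41402abc4b2a76b9719d911017c592", // same submitter
  },
];
```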
Most important files for the parsing process are:
- importer/src/import/parser/rawRow.js
- Maps Google Spreadsheets table column indexes into a named object
- When creating a new Google Form, you may need to remap the column indexes here
- importer/src/import/parser/formatData.js
- Casts Spreadsheet data from a named object into objects that vary by the entry type. Entry types are:
- people
- collections
- projects
- workshops
- tools
- Data added here will be available through the API, unless a field is opted out
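As a rough illustration of what rawRow.js does, here is a minimal sketch; the column indexes and field names are invented for the example and do not reflect the actual form.

```js
// Sketch of the idea behind importer/src/import/parser/rawRow.js:
// map raw spreadsheet column indexes to a named object.
// The indexes and field names below are illustrative assumptions.
function rawRow(row) {
  return {
    timestamp: row[0],
    email: row[1],
    name: row[2],
    entryType: row[3],
    projectTitle: row[4],
    // When the Google Form changes, the spreadsheet columns shift,
    // so these indexes need to be remapped here.
  };
}

module.exports = rawRow;
```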
For flags and additional manually or automatically added details, we use a separate table alongside the Google Forms table. In this table, each row index matches a row index of the Google Forms table.
| importedBefore | eventRole | Slug (read-only) |
|---|---|---|
| 1 | | painting-a-painting |
| 2 | | alice |
| 2 | | bob |
When a row is read, the program checks in this second table whether importedBefore is empty. If it is empty, the program marks it with a number (1 = imported as a proposal, 2 = imported only as a person) and also saves the generated slug. For each new row in the spreadsheet (importedBefore is empty), additional tasks are performed. The read-only slug can be used for data analysis purposes in the spreadsheet.
Newly imported rows (those where importedBefore is empty) trigger automations. Every submission adds its creator to the Mailerlite mailing list through their API.
If a submission contained a proposal (entry type is not people), a Mattermost channel will be created through their API.
The user is also sent an acknowledgment email. Its contents vary depending on whether a Mattermost channel was created (newUser / newUserAndChannel). If a Mattermost channel was created, the channel link is added to the email.
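The per-row flow can be summarized with the sketch below. This is illustrative only: the helper names in `services` are hypothetical wrappers around Mailerlite, Mattermost, and the mailer, not the importer's real API.

```js
// Illustrative sketch of the per-row import flow described above.
// `services` bundles hypothetical wrappers (markImported, addToMailerlite,
// createMattermostChannel, sendEmail); the importer's real code will differ.
async function processRow(row, flagsRow, services) {
  if (flagsRow.importedBefore) return; // already handled on an earlier run

  const isProposal = row.entryType !== "people";
  // 1 = imported as a proposal, 2 = imported only as a person
  await services.markImported(flagsRow, isProposal ? 1 : 2, row.slug);

  // Every submission adds its creator to the Mailerlite mailing list.
  await services.addToMailerlite(row.email);

  // Proposals additionally get a Mattermost channel.
  let channelUrl = null;
  if (isProposal) {
    channelUrl = await services.createMattermostChannel(row.slug);
  }

  // The acknowledgment email template depends on whether a channel was created.
  await services.sendEmail(row.email, {
    template: channelUrl ? "newUserAndChannel" : "newUser",
    channelUrl,
  });
}
```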
The frontend app uses the GET /api/results API route to fetch data.
Filtering is possible using query parameters:
- `type` - One of these: people, collections, workshops, projects, tools
- `ownerHash` - Email hash of a person. Results will be all entries submitted using this email (= person).
- `slug` - Entry slug
Multiple filters can be combined, but the same query parameter can't be repeated more than once.
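For illustration, a call combining filters might look like the following sketch; the helper is illustrative and not the app's actual api.js.

```js
// Illustrative helper for querying the results API with optional filters.
async function fetchResults(filters = {}) {
  const query = new URLSearchParams(filters).toString();
  const res = await fetch(`/api/results?${query}`);
  return res.json();
}

// e.g. all workshops submitted by one person:
// fetchResults({ type: "workshops", ownerHash: "<emailHash>" });
```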
The backend also serves the built React frontend at the / route from the ./web/ directory.
This is why docker-compose.yml only needs to spin up the importer and backend containers.
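The following sketch shows the general idea, assuming an Express-style backend (an assumption; the real backend may be organized differently, and the field names are the same illustrative ones used earlier):

```js
// Minimal sketch: serve the built React app and the API from one process.
const express = require("express");
const fs = require("fs");

const app = express();

// Serve the built React frontend at /
app.use(express.static("./web"));

// Serve the API, filtering the importer's output by the query parameters
app.get("/api/results", (req, res) => {
  const data = JSON.parse(fs.readFileSync("./data/data.json", "utf8"));
  const { type, ownerHash, slug } = req.query;
  const results = data.filter(
    (entry) =>
      (!type || entry.type === type) &&
      (!ownerHash || entry.ownerHash === ownerHash) &&
      (!slug || entry.slug === slug)
  );
  res.json(results);
});

app.listen(80);
```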
- Clone the repository
- Install the dependencies by running `npm install` in each component directory
- Each component can be run using `npm start`. Remember to configure the app using the instructions below. You may need to run the components simultaneously, though you can run the importer just once if you're not actively working on the parsing.
Running the frontend development server will normally tell you the port it's listening on. The backend listens on port 80 by default, but this can be changed in config/backend/config.json. Make sure that the baseResultsUrl constant in frontend/src/api.js points to the correct URL so that the frontend can communicate with the backend.
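For example (the exact value is an assumption and depends on where your backend runs):

```js
// frontend/src/api.js - illustrative value; adjust to your backend's address.
const baseResultsUrl = "http://localhost:80/api/results";
```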
First, go through the config/ folder and configure all the *.example.json files accordingly. You'll need to have your Mattermost API, Mailerlite API, SMTP server, and Google Spreadsheets details on hand.
Create an empty folder in the root named data/.
In case you're starting completely from scratch, you'll need your Google Forms submission form and the spreadsheet linked to it. You can ask AvoinGLAM folks for a template, or create a form and rewire importer/src/import/parser/rawRow.js yourself.
- Go to the Google Developers Console
- Select your project or create a new one (and then select it)
- Enable the Sheets API for your project
- In the sidebar on the left, select APIs & Services > Library
- Search for "sheets"
- Click on "Google Sheets API"
- Click the blue "Enable" button
- Follow the steps above to set up the project and enable the Sheets API
- Create a service account for your project
- In the sidebar on the left, select APIs & Services > Credentials
- Click the blue "+ CREATE CREDENTIALS" button and select the "Service account" option
- Enter name, description, click "CREATE"
- You can skip permissions, click "CONTINUE"
- Click "+ CREATE KEY" button
- Select the "JSON" key type option
- Click "Create" button
- Your JSON key file is generated and downloaded to your machine (it is the only copy!)
- Click "DONE"
- Note your service account's email address (also available in the JSON key file)
- Share the Google Sheet (that is connected to the Forms) with your service account using the email noted above (important)
- Move the JSON key file downloaded in the previous step to `config/google-key.json`
- Create a JSON file `config/config.json` with the following content:

  ```json
  {
    "spreadsheet": "SPREADSHEET_ID_HERE"
  }
  ```

You can obtain the spreadsheet ID from the URL address of the spreadsheet, which looks like this: https://docs.google.com/spreadsheets/d/[SPREADSHEET_ID_IS_HERE]/edit
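With the key file and spreadsheet ID in place, reading the sheet with a service account generally looks like the sketch below, using the googleapis Node.js client. This is not the importer's actual code; the range name "Form Responses 1" in particular is an assumption.

```js
// Illustrative sketch: authenticate with the service account key and read
// the linked spreadsheet via the googleapis Node.js client.
const { google } = require("googleapis");
const config = require("./config/config.json");

async function readSheet() {
  const auth = new google.auth.GoogleAuth({
    keyFile: "config/google-key.json",
    scopes: ["https://www.googleapis.com/auth/spreadsheets.readonly"],
  });
  const sheets = google.sheets({ version: "v4", auth });

  const res = await sheets.spreadsheets.values.get({
    spreadsheetId: config.spreadsheet,
    range: "Form Responses 1", // tab name is an assumption
  });
  return res.data.values; // array of rows, each an array of cell values
}
```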
The GitHub workflow in this repository automatically builds the React app and the Docker images on each push to the production branch.
You can reproduce the build steps manually as follows.
- backend
  - Build React app

    ```
    cd frontend
    npm run build
    ```
  - Build Docker image

    ```
    cd backend
    docker build -t docker.pkg.github.com/avoinglam/h4og-dashboard/backend .
    ```
- importer
  - Build Docker image

    ```
    cd importer
    docker build -t docker.pkg.github.com/avoinglam/h4og-dashboard/importer .
    ```
Finally, you can spin up the images using the docker-compose.yml.

```
$ docker-compose up
```
Commons user Sailko, CC BY 3.0 https://creativecommons.org/licenses/by/3.0, via Wikimedia Commons
"Goldener Bilderrahmen - gold picture frame" by eriwst is licensed under CC BY-SA 2.0
Check that:
- On the network tab of your browser's developer tools, the API request from the frontend to the backend is responded to with status code 200 OK
- `data/data.json` is not empty
- Dump the backend's data variable to debug the issue
```
Error: invalid_grant: Invalid JWT: Token must be a short-lived token (60 minutes) and in a reasonable timeframe. Check your iat and exp values in the JWT claim.
```
Check if the clock in the Docker container is off. You can re-sync the clock by running the following command:
```
docker run --rm --privileged node:lts-alpine hwclock -s
```