Commit d32ecc7

jeremymanning and claude committed

Add CV auto-generation documentation to README

- Document CV build system architecture and file structure
- Explain custom LaTeX-to-HTML parser functionality
- List key functions in extract_cv.py with descriptions
- Document cv.css stylesheet features
- Explain GitHub Actions workflow triggers
- Document test coverage (61 tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent e1ce5cd commit d32ecc7

1 file changed: 119 additions & 2 deletions

README.md
@@ -25,7 +25,9 @@ contextlab.github.io/
 │ ├── publications/ # Publication thumbnails
 │ └── software/ # Software project images
 ├── documents/
-│ └── JRM_CV.pdf # Jeremy Manning's CV
+│ ├── JRM_CV.tex # CV LaTeX source (edit this!)
+│ ├── JRM_CV.pdf # Generated PDF (auto-built)
+│ └── JRM_CV.html # Generated HTML (auto-built)
 ├── data/ # Content source files (edit these!)
 │ ├── publications.xlsx
 │ ├── people.xlsx
@@ -35,7 +37,9 @@ contextlab.github.io/
 │ ├── people.html
 │ └── software.html
 ├── scripts/ # Build and validation scripts
-│ ├── build.py
+│ ├── build.py # Content page builder
+│ ├── build_cv.py # CV build orchestration
+│ ├── extract_cv.py # LaTeX-to-HTML parser
 │ └── ...
 └── tests/ # Automated tests
 └── ...
@@ -274,6 +278,119 @@ You can also manually trigger a build from the [Actions tab](https://github.com/

---

## CV Auto-Generation

Jeremy Manning's CV is automatically compiled from LaTeX source into both PDF and HTML formats. The HTML version matches the PDF styling and includes a download button.

### How It Works

```
documents/
├── JRM_CV.tex # LaTeX source (edit this!)
├── JRM_CV.pdf # Generated PDF
└── JRM_CV.html # Generated HTML

scripts/
├── build_cv.py # Build orchestration
└── extract_cv.py # Custom LaTeX-to-HTML parser

css/
└── cv.css # CV-specific stylesheet

data/
└── DartmouthRuzicka-*.ttf # Dartmouth Ruzicka fonts

.github/workflows/
└── build-cv.yml # GitHub Actions automation
```

### Updating the CV

1. Edit `documents/JRM_CV.tex` in any LaTeX editor
2. Push to GitHub; the CV is rebuilt automatically
3. Both PDF and HTML versions are generated and committed
312+
313+
### Building Locally
314+
315+
```bash
316+
# Install dependencies
317+
pip install -r requirements-build.txt
318+
319+
# Build CV (requires XeLaTeX)
320+
cd scripts
321+
python build_cv.py
322+
323+
# Run tests
324+
python -m pytest tests/test_build_cv.py -v
325+
```

### Custom LaTeX Parser

The `extract_cv.py` script provides a custom LaTeX-to-HTML converter that handles:

- **Text formatting**: `\textbf`, `\textit`, `\emph`, `\textsc`, `\ul`
- **Links**: `\href{url}{text}` → `<a href="url">text</a>`
- **Lists**: `etaremune` (reverse-numbered), `itemize`, `enumerate`
- **Multi-column**: `\begin{multicols}{2}` → CSS two-column layout
- **Sections**: `\section*`, `\subsection*` → semantic HTML headings
- **Special characters**: em-dashes, en-dashes, quotes, accented characters
- **Footnotes**: `\blfootnote{}` → section notes displayed inline
- **Block spacing**: `\\[0.1cm]` → visual block separators
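
The formatting and link conversions listed above can be sketched roughly as follows. This is an illustration only, not the code in `extract_cv.py`: the function name `convert_simple_formatting`, the `SIMPLE_COMMANDS` table, and the choice of HTML tags (for example `<u>` for `\ul`) are assumptions, and `\textsc` and HTML escaping are left out for brevity.

```python
import re

# Hypothetical mapping from one-argument LaTeX commands to HTML tags.
# The tag choices are assumptions, not taken from extract_cv.py.
SIMPLE_COMMANDS = {
    "textbf": "strong",
    "textit": "em",
    "emph": "em",
    "ul": "u",
}

# Match a command whose argument contains no further braces.
SIMPLE_RE = re.compile(r"\\(" + "|".join(SIMPLE_COMMANDS) + r")\{([^{}]*)\}")
HREF_RE = re.compile(r"\\href\{([^{}]*)\}\{([^{}]*)\}")


def _wrap(match):
    """Wrap the command argument in the HTML tag mapped to the command."""
    tag = SIMPLE_COMMANDS[match.group(1)]
    return f"<{tag}>{match.group(2)}</{tag}>"


def convert_simple_formatting(text):
    """Convert formatting commands and href links to HTML.

    Each pass only rewrites commands whose arguments are brace-free, so
    nested commands are resolved from the inside out by looping until the
    text stops changing.
    """
    previous = None
    while previous != text:
        previous = text
        text = SIMPLE_RE.sub(_wrap, text)
        text = HREF_RE.sub(r'<a href="\1">\2</a>', text)
    return text


print(convert_simple_formatting(r"\textbf{Hello \textit{world}}"))
```

A regex loop like this is easy to read but cannot handle arguments containing literal braces; a brace-aware scanner is the more robust choice for real CV content.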

### Key Functions in extract_cv.py

| Function | Purpose |
|----------|---------|
| `extract_document_body()` | Extract content between `\begin{document}` and `\end{document}` |
| `balanced_braces_extract()` | Parse nested LaTeX braces correctly |
| `convert_latex_formatting()` | Convert LaTeX commands to HTML |
| `parse_etaremune()` | Parse reverse-numbered publication lists |
| `extract_header_info()` | Extract name and contact information |
| `extract_sections()` | Split document into sections/subsections |
| `render_section_content()` | Convert section content based on type |
| `generate_html()` | Assemble complete HTML document |
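
To make the first two rows of the table more concrete, here is a minimal sketch of body extraction and balanced-brace parsing. The names `document_body` and `read_braced_argument`, along with their signatures and return values, are hypothetical; the actual functions in `extract_cv.py` may be structured differently.

```python
from typing import Tuple


def document_body(tex: str) -> str:
    """Return the text between the document begin and end markers."""
    begin, end = r"\begin{document}", r"\end{document}"
    start = tex.index(begin) + len(begin)
    stop = tex.index(end, start)
    return tex[start:stop].strip()


def read_braced_argument(text: str, start: int) -> Tuple[str, int]:
    """Return (content, end_index) for the brace group opening at text[start].

    end_index points just past the matching closing brace. A backslash
    skips the character after it, so escaped braces do not change the
    nesting depth.
    """
    if start >= len(text) or text[start] != "{":
        raise ValueError("expected an opening brace at the start index")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1 : i], i + 1
        i += 1
    raise ValueError("unbalanced braces")


sample = r"\textbf{a {nested} b} tail"
content, end = read_braced_argument(sample, sample.index("{"))
print(content)
```

Tracking depth explicitly is what lets the argument `a {nested} b` be returned whole, which a non-greedy regular expression would cut short at the first closing brace.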

### CV Stylesheet (cv.css)

The stylesheet provides:

- **Dartmouth Ruzicka font** via `@font-face` declarations
- **Dartmouth green** color scheme: `rgb(0, 105, 62)`
- **Sticky download bar** at top of page
- **Responsive design** for tablet and mobile
- **Print styles** that match PDF appearance
- **Reverse-numbered lists** using native `<ol reversed>` support
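
As a rough illustration of how an `etaremune` list could map onto `<ol reversed>`, the sketch below splits an environment on `\item` and emits the list markup. The function name `etaremune_to_html` is hypothetical and the output format is an assumption; optional environment arguments and nested lists are not handled.

```python
import re
from html import escape


def etaremune_to_html(latex):
    """Render the first etaremune environment as a reversed ordered list."""
    match = re.search(
        r"\\begin\{etaremune\}(.*?)\\end\{etaremune\}", latex, re.DOTALL
    )
    if match is None:
        raise ValueError("no etaremune environment found")
    # Split on \item; the word boundary avoids matching commands like \itemsep.
    parts = re.split(r"\\item\b", match.group(1))
    items = [part.strip() for part in parts if part.strip()]
    lines = ["<ol reversed>"]
    lines += [f"  <li>{escape(item)}</li>" for item in items]
    lines.append("</ol>")
    return "\n".join(lines)


example = (
    "\\begin{etaremune}\n"
    "\\item Paper C\n"
    "\\item Paper B\n"
    "\\item Paper A\n"
    "\\end{etaremune}"
)
print(etaremune_to_html(example))
```

With the `reversed` attribute the browser counts down from the number of items, so the markup carries no explicit numbers and the stylesheet needs no counter logic.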

### Automatic Builds (GitHub Actions)

The `build-cv.yml` workflow triggers when you push changes to:
- `documents/JRM_CV.tex`
- `scripts/build_cv.py` or `scripts/extract_cv.py`
- `css/cv.css`
- `.github/workflows/build-cv.yml`

The workflow:
1. Installs TeX Live and Dartmouth Ruzicka fonts
2. Compiles LaTeX to PDF using XeLaTeX
3. Converts LaTeX to HTML using the custom parser
4. Runs 61 automated tests
5. Commits and pushes the generated files
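
The PDF compilation step might look something like the sketch below when driven from Python. This is not the contents of `build_cv.py`: the helper names `xelatex_command` and `compile_pdf`, the two-pass default, and the specific flags chosen are assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def xelatex_command(tex_file: Path) -> List[str]:
    """Build the XeLaTeX command line for a single non-interactive run."""
    return [
        "xelatex",
        "-interaction=nonstopmode",  # never stop to prompt for input
        "-halt-on-error",            # fail fast on the first LaTeX error
        tex_file.name,
    ]


def compile_pdf(tex_file: Path, passes: int = 2) -> Path:
    """Run XeLaTeX in the source directory and return the expected PDF path.

    Two passes is an assumed default so that cross-references settle.
    """
    for _ in range(passes):
        subprocess.run(
            xelatex_command(tex_file),
            cwd=tex_file.parent,  # the PDF is written next to the .tex file
            check=True,           # raise CalledProcessError on failure
        )
    return tex_file.with_suffix(".pdf")


print(xelatex_command(Path("documents/JRM_CV.tex")))
```

Calling `compile_pdf(Path("documents/JRM_CV.tex"))` requires XeLaTeX on the `PATH`; keeping the command construction in its own function lets it be checked without a TeX installation.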

### Testing

The test suite (`tests/test_build_cv.py`) includes 61 tests covering:

- LaTeX formatting conversion
- Balanced brace parsing
- Section extraction
- HTML generation
- PDF compilation
- Content validation
- Link validation
- Edge cases
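
To give a flavor of the link-validation category, here is a hypothetical check written with the standard library only. It is not taken from `tests/test_build_cv.py`; the names `LinkCollector`, `is_well_formed`, and `malformed_links`, and the rules for what counts as well-formed, are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def is_well_formed(href):
    """Accept http(s) URLs with a host, mailto addresses, and relative links."""
    parsed = urlparse(href)
    if parsed.scheme in ("http", "https"):
        return bool(parsed.netloc)
    if parsed.scheme == "mailto":
        return "@" in parsed.path
    if parsed.scheme == "":
        return bool(parsed.path or parsed.fragment)
    return False


def malformed_links(html_text):
    """Return the hrefs in html_text that fail the well-formedness check."""
    collector = LinkCollector()
    collector.feed(html_text)
    return [href for href in collector.hrefs if not is_well_formed(href)]


def test_links_are_well_formed():
    html_text = (
        '<p><a href="https://example.com">site</a> '
        '<a href="mailto:someone@example.com">mail</a> '
        '<a href="JRM_CV.pdf">download</a></p>'
    )
    assert malformed_links(html_text) == []
```

A check like this only inspects URL syntax; confirming that links actually resolve would need network access and is usually kept out of fast unit tests.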

---

## Adding Content (Legacy/Manual Method)

> **Note:** For publications, people, and software pages, use the spreadsheet method above. The manual method below is for other pages or special cases.
