diff --git a/_posts/2025-10-04-spec-kit.md b/_posts/2025-10-04-spec-kit.md new file mode 100644 index 0000000..6dda6f3 --- /dev/null +++ b/_posts/2025-10-04-spec-kit.md @@ -0,0 +1,313 @@ +--- +layout: post +comments: true +title: "Building with Spec Kit" +excerpt: "A step-by-step guide to using Spec Kit for spec-driven development, demonstrated by building a complete podcast website with an AI assistant." +categories: [AI, Software Development, GitHub] +tags: [Spec Kit, Spec-Driven Development, LLM, Copilot, Next.js] +toc: true +img_excerpt: +mermaid: true +--- + +[Spec-driven development](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-using-markdown-as-a-programming-language-when-building-with-ai/) is a methodology that emphasizes defining a detailed specification for an application or feature before writing any code. This approach involves providing a Large Language Model (LLM) with a comprehensive set of instructions, constraints, and goals. The LLM then uses this "spec" to generate the application code, ensuring the final product aligns with the initial vision. +The core idea is that you invest time upfront to define the specification, and then have the LLM build exactly what you wanted, per that specification. + +[Spec Kit](https://github.com/github/spec-kit) is an open-source tool developed at GitHub and designed to facilitate the spec-driven development process. It provides a set of templates and a command-line interface (CLI) called `specify` to structure and streamline the creation of these specifications. + +The rest of this article walks through how to leverage **Spec Kit** with VS Code to build a simple web application. + +### Core Components of the Spec Kit Approach + +The Spec Kit methodology is built around four key prompting documents, each serving a distinct purpose in guiding the AI. + +* **Constitution.md:** This document establishes the "non-negotiable principles" and constraints for your project. It's where you define the foundational rules that the AI must follow in every task. +* **Spec.md:** This is the feature specification, analogous to a Product Requirements Document (PRD). It focuses on the **what** and the **why** of the feature you are building, not the technical implementation. It is generated and maintained by the `/specify` command. +* **Plan.md:** The plan translates the "what" and "why" from the spec into the **how**. It outlines the technical approach for building the feature, taking into account the rules defined in the constitution. It is generated and maintained by the `/plan` command. +* **Tasks.md:** This final document breaks down the high-level plan into a series of small, concrete, and actionable tasks for the AI to execute. This granular breakdown is crucial for guiding the AI effectively. It is generated and maintained by the `/tasks` command. + + +### Spec Kit Workflow Diagram + +The diagram below illustrates the different stages of the Spec Kit workflow, from initial idea to a functional application. Each stage builds upon the previous one, creating a clear and structured path for the AI to follow. + +1. **Initialization**: The process begins with a project idea. You run `specify init`, which bootstraps a project by creating the foundational documents from a set of templates. Most importantly, it generates `constitution.md`, the document that will contain the core principles and constraints for the AI, along with configuration files in the `.specify/` and `.github/prompts/` directories. + +2. 
**Specification (The What)**: Next, you define the feature's requirements using the `/specify` command. You provide a high-level prompt describing what you want to build (e.g., "I want a podcast site..."). This generates `spec.md`, which details the user stories, functional requirements, and acceptance criteria. This document is a living blueprint that you can refine until it accurately captures the feature's purpose. + +3. **Planning (The How)**: The next step is to create a technical plan. Using the `/plan` command, you provide technical direction (e.g., "Use Next.js, mock data..."). The AI assistant then generates `plan.md`, a technical document that outlines the architecture, dependencies, and file structure. Crucially, this plan must adhere to the rules established in the `constitution.md`. + +4. **Task Breakdown**: With the "what" and "how" defined, the `/tasks` command is used to break down the technical plan into a series of small, actionable steps. This generates `tasks.md`, which serves as a granular checklist for the AI. This file lists concrete actions like "Set up linting" or "Create UI components," providing a clear, step-by-step path for implementation. + +5. **Execution**: Finally, you instruct the AI assistant to `implement the tasks` and it will follow the checklist in `tasks.md` to write code. This stage is iterative; you review the generated code, provide feedback, and repeat until the application is complete and meets all the requirements outlined in the `spec.md`. + +```mermaid +graph TD + subgraph "1. Initialization" + A[Start: Project Idea] --> B(specify init); + B -- "Generates" --> C[constitution.md
.specify/
.github/prompts/]; + end + + subgraph "2. Specification (The What)" + C --> D{/specify}; + D -- "User Prompt" --> E["I want a podcast site..."]; + E -- "Generates" --> F[spec.md]; + F -- "Contains" --> G["User Stories
Requirements
Acceptance Criteria"]; + G --> H(Refine Spec); + end + + subgraph "3. Planning (The How)" + H --> I{/plan}; + I -- "User Prompt" --> J["Use Next.js, mock data..."]; + J -- "Generates" --> K[plan.md]; + K -- "Respects" --> C; + K -- "Contains" --> L["Architecture
Dependencies
File Structure"]; + end + + subgraph "4. Task Breakdown" + L --> M{/tasks}; + M -- "Generates" --> N[tasks.md]; + N -- "Contains" --> O["Granular checklist
e.g., Setup linting, Create components"]; + end + + subgraph "5. Execution" + O --> P{implement the tasks}; + P --> Q[AI Writes Code]; + Q --> R(Review & Iterate); + R --> S[Finish: Functional App]; + end +``` + + +This structured workflow ensures that the final application is a direct translation of the initial specification, guided by a consistent set of principles and a well-defined technical plan. + +### A Practical Example + +This section walks through a practical example of using Spec Kit with an AI assistant (like GitHub Copilot) to build a web application (a podcast landing page) from scratch. + +#### 1. Installation and Initialization + +First, install the `specify` CLI and initialize a project with `uvx specify init "pod site"`. This will start an interactive setup, prompting you to select an AI assistant (e.g., Copilot) and a helper script language (e.g., PowerShell or Bash). +It then scaffolds the necessary template files in the new project, including the `.specify` and `.github/prompts` directories. + +![Installing and initializing the specify CLI]({{ "/assets/2025/10/20251004-specify.png" | absolute_url }}){: .center-image } + + +#### 2. Define Your Constitution +One of the important files generated by the setup is `constitution.md`, under the `.specify/memory` folder, which establishes the project's non-negotiable principles. These principles will guide the AI assistant during subsequent code generation and thus must be updated to match the project's purpose. + +Instead of editing it manually, you can have the AI assistant update it by prompting it with something like: + +``` +Let's update this constitution for a web application set of constraints. +``` + +The AI assistant takes this prompt and updates the constitution with rules that are more suitable for a web application, e.g. "User-Centric & Accessibility First" and "Secure by Design." + + + +#### 3. Specify Your Feature + +Next, use the `/specify` command to create a feature specification, focusing on the *what* and *why*, not the *how*. For example: + +``` +/specify I am building a podcast landing page for VS Code Insider. Make it modern, dark theme, use featured speakers on the main page for featured conversations. Allow discovery of related episodes once I go to the Episodes page. Every episode page has a detailed transcript (mock that data) and there should be at least 20 mock episodes. +``` + +This will make the AI assistant create a new `specs/` folder with a subfolder named after the feature (e.g. `001-i-am-building`) and generate a `spec.md` file within it. This document includes sections for user stories, functional requirements, and acceptance criteria. + +Note that the generated file may contain `[NEEDS CLARIFICATION]` markers for ambiguities. Before going any further, these ambiguities need to be addressed. You can ask the AI assistant to refine `spec.md` and resolve them, for example by prompting: + +``` +Fill in the clarification items as best as you think +``` + +or + +``` +Review the acceptance checklist and then update it in the spec +``` + +The AI assistant will then update the spec, making reasonable assumptions to create a more robust document. Such further review can add crucial sections like "Out of Scope," "Success Metrics," and "Risks," making the spec even clearer. + + +#### 4. Create a Technical Plan + +Once the spec is finalized, use the `/plan` command to translate it into a technical blueprint. This is where you define the *how*. 
For example, use the prompt: + +``` +/plan use Next.js, all data is mocked - no database or auth +``` + +The assistant will then generate `plan.md` and `research.md` files, which detail the technical architecture, project structure, dependencies (e.g. Next.js, TypeScript), and testing strategies, all while respecting the rules laid out in the constitution. + + + +#### 5. Break Down the Plan into Tasks + +Next, use the `/tasks` command to break down the previous plan into a granular, actionable checklist for the AI assistant to implement. Example prompt: + +``` +/tasks break down the plan into tasks +``` + +The assistant will analyze the `plan.md` and generate a `tasks.md` file containing a series of small, concrete steps, ordered with dependencies in mind, e.g., "Set up linting," "Create failing test stubs for components," "Implement core data structures," "Build UI components". + +
+Example generated tasks.md + +
+ + # Tasks: VS Code Insider Podcast Landing (Mock Next.js) + + **Input**: Design documents from `/specs/001-i-am-building/` (plan.md) + **Prerequisites**: plan.md (available). No data-model.md, contracts/, or research.md present. All endpoints are static pages (no API contracts). All data mocked. + + ## Execution Flow (main) + ``` + 1. Load plan.md (done) + 2. No optional docs → skip contract/entity extraction (entities inferred from plan Episode model) + 3. Generate tasks: Setup → Tests (failing first) → Core Components → Pages → Integration (a11y/perf) → Polish + 4. Apply rules: Different files → [P]; same file sequence unmarked + 5. Number tasks (T001..) + 6. Provide dependency graph + parallel batches + 7. Validate completeness (entities, user journeys covered by integration tests) + 8. Return SUCCESS + ``` + + ## Format: `[ID] [P?] Description` + [P] indicates can run in parallel (distinct files / no dependency). + + ## Phase 1: Setup + - [ ] T001 Initialize Next.js + TypeScript project structure (already scaffolded) – verify `package.json`, `tsconfig.json`, `next.config.mjs` match plan. + - [ ] T002 Add lint & type scripts enforcement (ESLint config extension if needed) in `.eslintrc.json` and ensure `npm run lint` passes. + - [ ] T003 [P] Add basic Vitest + RTL test setup in `tests/setup.ts` (jest-dom, axe optional comment) & update `package.json` test script. + - [ ] T004 [P] Add `tests/README.md` documenting test layers (unit, component, a11y) referencing plan section 12. + + ## Phase 2: Tests First (TDD) – MUST FAIL INITIALLY + Integration stories (derived from Acceptance Scenarios & FRs) before implementing missing logic. + - [ ] T005 Create integration test: landing shows 3–6 featured unique primary speakers in `tests/integration/landing.featured.test.tsx` (assert uniqueness rule & count range). (FR-002/023) + - [ ] T006 [P] Integration test: episodes listing shows ≥20 items & default sort newest in `tests/integration/episodes.list.test.tsx` (FR-005/017) + - [ ] T007 [P] Integration test: filter by single Tag reduces set & resets on clearing in `tests/integration/episodes.filter.tag.test.tsx` (FR-017) + - [ ] T008 [P] Integration test: filter by Speaker works similarly in `tests/integration/episodes.filter.speaker.test.tsx` (FR-017) + - [ ] T009 [P] Integration test: episode detail shows transcript + expansion control only when > threshold in `tests/integration/episode.transcript.test.tsx` (FR-008/019/021) + - [ ] T010 [P] Integration test: related episodes shows 3 or fallback message in `tests/integration/episode.related.test.tsx` (FR-009/010/022) + - [ ] T011 [P] Integration test: navigation continuity (back stack) from related episode to previous detail in `tests/integration/navigation.explore-continuity.test.tsx` (FR-015) + - [ ] T012 [P] Accessibility smoke: landmark roles + skip link focus + contrast token presence in `tests/a11y/landing.a11y.test.tsx` (FR-024/025) + - [ ] T013 [P] Performance marks presence test (mock) verifying `performance.mark` names exist in `tests/integration/perf.marks.test.ts` (Metrics / FR-001) + + ## Phase 3: Core Data & Utilities (after failing tests exist) + - [ ] T014 Implement catalog utilities & indexing in `lib/catalog.ts` (overlap scoring, related fallback) – ensure tests start passing for related logic. + - [ ] T015 [P] Implement filter helpers in `lib/filters.ts` including sort logic (newest/oldest) & tag/speaker single-select. + - [ ] T016 [P] Implement transcript helper in `lib/transcript.ts` (isCollapsible) enforcing thresholds. 
+ - [ ] T017 Validate dataset rules via a script `scripts/validate-episodes.mjs` (counts, featured uniqueness, threshold flags) and add `npm run validate:data`. + + ## Phase 4: Components (UI Building Blocks) + - [ ] T018 Create `app/_components/EpisodeCard.tsx` (card metadata layout) – test reuse via integration tests. + - [ ] T019 [P] Create `app/_components/FeaturedConversations.tsx` (filters featured & uniqueness) per FR-002/023. + - [ ] T020 [P] Create `app/_components/FiltersBar.tsx` (tag, speaker, sort controls) per FR-017. + - [ ] T021 [P] Create `app/_components/Transcript.tsx` (collapse/expand, focus restore) per FR-008/019/021. + - [ ] T022 [P] Create `app/_components/RelatedEpisodes.tsx` (3 or fallback) per FR-009/010. + - [ ] T023 [P] Create skeleton components (EpisodeCardSkeleton, TranscriptSkeleton, FeaturedSkeleton) in `app/_components/skeletons/` per FR-016. + + ## Phase 5: Pages & Layout + - [ ] T024 Assemble landing `app/page.tsx` (hero, FeaturedConversations, recent episodes slice) per FR-001/002/005. + - [ ] T025 [P] Assemble episodes listing `app/episodes/page.tsx` using FiltersBar + EpisodeCard grid per FR-005/017. + - [ ] T026 Assemble episode detail `app/episodes/[slug]/page.tsx` (metadata, transcript, related) per FR-007/008/009/010/019. + - [ ] T027 Add breadcrumbs / navigation continuity enhancements (if not already present) in layout or detail page per FR-012/015. + + ## Phase 6: Integration / Accessibility / Performance + - [ ] T028 Add performance marks (`landing-skeleton`, `landing-first-content`, `detail-transcript-mounted`) in relevant components. + - [ ] T029 [P] Accessibility refinements: ensure ARIA labels, roles, and focus visible outlines; update any missing alt text. + - [ ] T030 [P] Add validation script output documentation `VALIDATION.md` capturing success metrics results. + + ## Phase 7: Polish + - [ ] T031 Add unit tests for catalog, filters, transcript (word threshold) in `tests/unit/` (FR-009/017/019). + - [ ] T032 [P] Add component tests for Transcript expand/collapse & Related fallback with jest-axe checks in `tests/component/`. + - [ ] T033 [P] Add README updates (metrics section + how to run validation) in root `README.md`. + - [ ] T034 [P] Light refactor pass removing duplication (shared tag rendering) & ensure strict TypeScript passes. + - [ ] T035 Final accessibility manual checklist & record in `VALIDATION.md`. + - [ ] T036 Prepare release notes summary in `specs/001-i-am-building/VALIDATION.md` linking back to tasks. 
+ + ## Dependencies + - T001 → T002/T003/T004 + - Tests (T005–T013) must exist & fail before implementing T014–T026 + - T014 precedes T022 (shared related logic) & T024–T026 + - T015 precedes T020 & T025 + - T016 precedes T021 & T026 + - Components (T018–T022) precede pages T024–T026 + - Skeletons (T023) precede perf marks T028 if marks rely on skeleton mount + - T028 depends on pages assembled (T024–T026) + - Polish tasks (T031–T036) depend on prior phases + + ## Parallel Execution Examples + ``` + # Batch 1 (after T001): + Task: T002 (lint setup) + Task: T003 (test harness) [P] + Task: T004 (tests README) [P] + + # Batch 2 (tests phase – all parallel) after T004: + Tasks: T005 T006 T007 T008 T009 T010 T011 T012 T013 (all [P]) + + # Batch 3 (core utilities) after failing tests present: + Tasks: T015 T016 (parallel) while T014 starts first (sequential due to catalog central role) + + # Batch 4 (components parallel) after T014–T016: + Tasks: T019 T020 T021 T022 T023 (parallel) while T018 done first (EpisodeCard dependency) + + # Batch 5 (pages) after component batch: + Tasks: T024 T025 (parallel) then T026 (needs transcript + related + card) then T027 + + # Batch 6 (integration/perf/a11y) after pages: + Tasks: T028 T029 T030 (parallel) + + # Batch 7 (polish) after integration: + Tasks: T031 T032 T033 T034 T035 (parallel where file isolation) then T036 last summarizing + ``` + + ## Validation Checklist + - [ ] All integration tests (T005–T013) authored before implementation files modified + - [ ] Episode entity covered by catalog + transcript + related logic tasks + - [ ] No [P] tasks mutate same file concurrently + - [ ] Success metrics captured in VALIDATION.md (T030, T036) + - [ ] Accessibility criteria verified (T029, T035) + + ## Notes + - No API contracts; tasks emphasize UI & data logic. + - Data model implicit; single `Episode` entity plus derived relationships. + - Adjust if additional docs (data-model.md, contracts/) are added later. +
+ +
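+To make the task granularity concrete, here is a minimal sketch of what executing one of these tasks (T016, the transcript helper in `lib/transcript.ts`) might produce. The 200-word threshold and the exact function names are illustrative assumptions; the real values and signatures would come from the generated spec and plan.
+
+```ts
+// lib/transcript.ts -- hypothetical output of task T016.
+// NOTE: the 200-word threshold is an assumed value for illustration;
+// the actual threshold would be defined in spec.md / plan.md.
+export const COLLAPSE_WORD_THRESHOLD = 200;
+
+// Count whitespace-separated words in a transcript.
+export function wordCount(transcript: string): number {
+  const trimmed = transcript.trim();
+  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
+}
+
+// Show a collapse/expand control only when the transcript exceeds the
+// threshold, so short transcripts render in full (per FR-019/FR-021).
+export function isCollapsible(transcript: string): boolean {
+  return wordCount(transcript) > COLLAPSE_WORD_THRESHOLD;
+}
+```
+
+Because helpers like this are pure functions with no React or Next.js imports, the failing tests written in Phase 2 can exercise them in isolation before any UI exists.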
+ +#### 6. Execute and Review + +Finally, instruct the AI assistant to start working on the implementation by prompting it with: + +``` +implement the tasks +``` + +The assistant will follow the `tasks.md` checklist, writing code, creating files, and building the application step by step. In our example, it will scaffold a complete Next.js application, including components, pages, tests, and mock data, turning the detailed specification into a functional podcast website. + +Note: the process is iterative; you can review the AI's work, provide feedback, and guide it until the final product meets the requirements defined in the spec. + + +### Benefits of the Spec Kit Approach + +The Spec Kit approach offers several significant benefits, promoting a more structured and efficient development workflow. One of the core advantages is the enforcement of **consistency and standardization**. By defining a constitution, organizations can maintain uniform standards across hundreds of applications. This allows engineers to move between projects seamlessly without needing to relearn different technology stacks and coding conventions. + +Another powerful aspect is the **flexibility and reusability** that comes from separating the *what* (the spec) from the *how* (the plan). This distinction means that a feature's specification can remain constant even if the underlying technology changes. For instance, if a team decides to migrate from React to ASP.NET Core, they can reuse the existing spec to generate a new implementation without starting from scratch. + +This methodology also fosters **improved collaboration**. The spec serves as a "living, breathing document" that acts as the single source of truth for a feature's requirements and functionality. This ensures that everyone on the team has a shared understanding of the goals and can refer back to a consistent reference point. + +Finally, the Spec Kit approach drives **efficient development**. While it requires an upfront investment in planning, this structured process avoids the pitfalls of directionless coding. The granular tasks created from the plan help steer the AI in the right direction from the outset, minimizing the time spent correcting incorrect assumptions and ensuring the final product aligns with the initial vision. + +### Conclusion + +The Spec Kit methodology provides a powerful framework for leveraging AI assistants in software development. By separating concerns into a constitution, spec, plan, and tasks, it creates a structured, repeatable, and scalable process. As demonstrated here, this approach allows you to effectively guide an AI assistant to build a complete, well-architected application that aligns precisely with your vision. + +--- + +_I hope you enjoyed this article, feel free to leave a comment or reach out on twitter [@bachiirc](https://twitter.com/bachiirc)._ diff --git a/assets/2025/10/20251004-specify.png b/assets/2025/10/20251004-specify.png new file mode 100644 index 0000000..f7977d0 Binary files /dev/null and b/assets/2025/10/20251004-specify.png differ