@basnijholt commented Jan 4, 2026

Summary

Integrates the vector-backed memory system (agent_cli/memory/MemoryClient) into the chat agent via LLM tools. The LLM can explicitly store and retrieve memories, with user consent. Also supports automatic memory extraction mode.

Changes

  • config.py: Added Memory dataclass for memory configuration with 3 modes (off, tools, auto)
  • opts.py: Added CLI options (--memory-mode, --memory-path, --memory-top-k, etc.)
  • _tools.py: Refactored to class-based MemoryTools bound to MemoryClient
  • chat.py: Initialize MemoryClient on startup, pass to tools, automatic extraction in auto mode
  • docs/: Updated documentation for memory system
  • tests/: Added comprehensive tests for MemoryTools class

Memory Modes

| Mode | Behavior |
|------|----------|
| `off` | Memory system disabled |
| `tools` (default) | LLM decides when to store/retrieve via tools |
| `auto` | Automatic extraction after each conversation turn |

Memory Tools

| Tool | Description |
|------|-------------|
| `add_memory` | Store information (LLM asks permission first) |
| `search_memory` | Semantic search across stored memories |
| `list_all_memories` | List all stored memories |
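The class-based refactor in `_tools.py` could be sketched like this (the three tool names come from the PR; the `MemoryClient` calls they delegate to are assumptions):

```python
class MemoryTools:
    """Memory tools bound to a memory client and a conversation id (sketch)."""

    def __init__(self, memory_client, conversation_id: str) -> None:
        self._client = memory_client
        self._conversation_id = conversation_id

    def add_memory(self, content: str) -> str:
        # The system prompt instructs the LLM to ask the user before calling this.
        self._client.add(content, conversation_id=self._conversation_id)
        return f"Stored: {content}"

    def search_memory(self, query: str, top_k: int = 5) -> list[str]:
        # Semantic search over stored memories.
        return self._client.search(query, top_k=top_k)

    def list_all_memories(self) -> list[str]:
        return self._client.list_all()
```

Binding the client and conversation id in `__init__` is what lets the PR drop the module-level globals the closure-based version needed.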

CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| `--memory-mode` | `tools` | Memory mode: `off`, `tools`, or `auto` |
| `--memory-path` | auto | Path for memory database storage |
| `--embedding-model` | `text-embedding-3-small` | Embedding model for semantic search |
| `--memory-top-k` | `5` | Number of memories to retrieve |
| `--memory-score-threshold` | `0.35` | Minimum relevance score |

How it Works

  1. Memory system auto-initializes if [memory] extra is installed
  2. In tools mode: LLM has access to memory tools and decides when to use them
  3. In auto mode: Facts are automatically extracted after each conversation turn
  4. Falls back gracefully if memory extra not installed
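The graceful fallback in steps 1 and 4 amounts to a guarded import; a minimal sketch (the module path follows the PR, `memory_available` itself is a hypothetical helper):

```python
def memory_available(module: str = "agent_cli.memory.client") -> bool:
    """Return True when the optional [memory] extra can be imported.

    Hypothetical helper illustrating the graceful-fallback check; the chat
    agent simply skips memory initialization when this would be False.
    """
    try:
        __import__(module)
    except ImportError:  # extra not installed: chat continues without memory
        return False
    return True
```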

Test Plan

  • All tests pass (including new MemoryTools tests)
  • Pre-commit checks pass
  • Manual test: memory system initializes
  • Manual test: LLM asks permission before storing (tools mode)

Closes #158

basnijholt and others added 11 commits January 4, 2026 00:01
Replace the simple JSON-based memory in the chat agent with the advanced
MemoryClient that provides semantic search via ChromaDB and embeddings.

Changes:
- Add AdvancedMemory config dataclass with memory system options
- Add CLI options: --advanced-memory, --memory-path, --memory-top-k, etc.
- Refactor _tools.py to support dual backends (simple JSON vs advanced)
- Initialize MemoryClient in chat agent when --advanced-memory enabled
- Auto-fallback to simple memory if [memory] extra not installed

The advanced memory system is enabled by default and provides:
- Semantic search using vector embeddings
- MMR retrieval for diverse results
- Recency weighting and score thresholds
- Automatic fact extraction and reconciliation

Closes #158
- Add Memory System section to chat.md explaining the new advanced
  memory feature with semantic search
- Add cross-links between chat, memory, and architecture docs
- Regenerate auto-generated options tables to include new Memory Options
Remove the simple JSON-based memory system, keeping only the vector-backed
MemoryClient. This simplifies the codebase by eliminating the dual-backend
logic and the --advanced-memory flag.

- Rename AdvancedMemory config to Memory, remove enabled field
- Remove all simple memory functions from _tools.py
- Rename init_advanced_memory/cleanup_advanced_memory to init_memory/cleanup_memory
- Update chat.py to use simplified memory initialization
- Update documentation to remove "advanced" terminology
- Remove obsolete test_memory_tools.py
- Replace closure-based memory tools with MemoryTools class
- Pass memory_client and conversation_id directly to tools()
- Remove module-level globals (_memory_client, _conversation_id)
- Remove init_memory/cleanup_memory lifecycle functions
- Update chat.py to handle memory client lifecycle directly
- Add proper type hints using TYPE_CHECKING imports
- Update tests to pass new required parameters
The tool was misleading: it counted entries by internal role
(memory, user, assistant, summary) rather than by user-facing
categories (personal, preferences, facts, etc.).
@basnijholt changed the title feat(chat): integrate advanced vector-backed memory system → feat(chat): integrate vector-backed memory system with LLM tools Jan 4, 2026
```python
    Uses a hash of the history directory path to ensure consistency across sessions.
    """
    import hashlib  # noqa: PLC0415
```
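The idea in the snippet under review can be sketched as follows (the digest algorithm, prefix, and truncation length are assumptions, not the PR's exact code):

```python
import hashlib
from pathlib import Path


def derive_conversation_id(history_dir: Path) -> str:
    """Derive a conversation id that is stable across sessions.

    Sketch only: hashing the history directory path yields the same id
    every time the same chat history location is used.
    """
    digest = hashlib.sha256(str(history_dir).encode("utf-8")).hexdigest()
    return f"chat-{digest[:16]}"
```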

Import on top

```python
    Returns the MemoryClient if successful, None otherwise.
    """
    from agent_cli.memory.client import MemoryClient as MemoryClientImpl  # noqa: PLC0415
```

Just normal please

basnijholt and others added 15 commits January 4, 2026 01:10
Add memory mode selection with three options:
- off: Memory system disabled
- tools: LLM decides via add_memory/search_memory tools (default)
- auto: Automatic fact extraction after each conversation turn

The modes are mutually exclusive to avoid duplicate memory storage.
In "auto" mode, facts are automatically extracted from both user and
assistant messages without requiring explicit tool calls.

Resolves #184 as part of #183
Pass the user's configured openai model to extract_from_turn() instead
of using the default gpt-5-mini, which may not be available on custom
OpenAI-compatible endpoints.
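A sketch of the extraction step described in these two commits (`extract_from_turn` is named in the PR, but its signature, and the wrapper around it, are assumptions):

```python
def run_auto_extraction(
    memory_client,
    user_msg: str,
    assistant_msg: str,
    mode: str,
    model: str,
) -> bool:
    """After a conversation turn, extract facts automatically in "auto" mode.

    Hypothetical wrapper: the user's configured model is forwarded so that
    extraction also works on custom OpenAI-compatible endpoints.
    """
    if mode != "auto":
        return False  # "tools" mode leaves storage decisions to the LLM
    memory_client.extract_from_turn(
        user=user_msg, assistant=assistant_msg, model=model
    )
    return True
```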
- Fix opts.with_default type hint (str → Any) for bool support
- Fix FBT003 lint errors by using keyword arg default=True
- Fix tests using old --no-git-versioning option name
- Add comprehensive tests for MemoryTools class (30 tests)
- Document memory modes (off/tools/auto) in chat.md
In "auto" mode, the LLM now has read-only access to memory tools
(search_memory, list_all_memories) while extraction still happens
automatically. Previously, auto mode disabled all memory access
for the LLM, meaning stored facts couldn't be searched.

Also added read_only parameter to create_memory_tools() and
memory_read_only parameter to tools() function with tests.
In "auto" mode, relevant memories are now automatically retrieved
and injected into the system prompt before each LLM call. This
mirrors the memory-proxy behavior but only in auto mode.

Memory mode behavior:
- off: No memory at all
- tools: LLM has full control via tools (no auto-injection)
- auto: Auto-inject + read-only tools + auto-extract after turn
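The auto-mode injection described above could look roughly like this (the prompt wording and the list-of-strings shape of retrieved memories are assumptions):

```python
def inject_memories(base_prompt: str, memories: list[str]) -> str:
    """Append retrieved memories to the system prompt (auto mode only).

    Sketch of the auto-injection behavior; in "tools" mode the prompt is
    left untouched and the LLM queries memory explicitly instead.
    """
    if not memories:
        return base_prompt
    bullets = "\n".join(f"- {m}" for m in memories)
    return f"{base_prompt}\n\nRelevant memories about the user:\n{bullets}"
```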