
A lightweight, cross-platform Python utility designed to bridge the gap between Large Language Models (LLMs) and your local file system


pkeffect/directory-structure-llm


Directory Controller


Directory Controller is a lightweight, cross-platform Python utility designed to bridge the gap between Large Language Models (LLMs) and your local file system.

It solves two specific problems:

  1. Context Creation: Turning your existing project structure into a clean text file to provide context to an LLM.
  2. Scaffolding: Taking a text-based directory tree generated by an LLM (ChatGPT, Claude, DeepSeek) and instantly building the actual folders and empty files.

🚀 Key Features

Core Functionality

  • Universal Parsing: Intelligently parses various tree formats (Standard Tree ├──, ASCII +--, Markdown Lists -, or Tab indentation).
  • Smart Detection: Automatically finds structure files (e.g., structure.txt, tree.md) without needing specific filenames.
  • Auto-Scaffolding: Creates directories and empty placeholder files based on the input map.
  • Context-Aware: Uses heuristics (file extensions, trailing slashes, indentation lookahead) to distinguish files from folders even if the LLM output is messy.
  • Native Windows Support: Now includes a fix to force ANSI color codes to render correctly in the Windows Command Prompt (cmd.exe) and PowerShell.

New in v0.0.8 (Enterprise-Grade Safety)

  • Interactive Confirmation: Before modifying your filesystem (Option 2), the script provides a summary of detected nodes and explicitly asks for user confirmation (y/n). This prevents accidental execution on the wrong file.
  • Gitignore Integration: Respects your project's .gitignore file during scanning, automatically excluding build artifacts, logs, secrets, and temp files. Checks both filenames and relative paths.
  • Infinite Loop Protection (Symlinks): The scanner now detects symbolic links. It lists them in the output (e.g., link -> target) but does not recurse into them. This prevents the script from hanging on circular paths or duplicating large linked libraries.
  • Smart Collision Resolution: If you try to move a file to a folder where it already exists, the script will not overwrite it. Instead, it automatically renames the incoming file (e.g., script.py → script_1.py) to ensure zero data loss.
  • Binary & Encoding Resilience: The file reader now detects binary files (like images or compiled executables) and skips them instead of crashing. It also cycles through multiple encodings (utf-8-sig, latin-1) to handle files created in different environments (e.g., Windows Notepad).
  • Root-Wrap Prevention: Detects if the LLM has hallucinated a root folder that matches your current directory name. It automatically un-nests the structure to prevent project/project/src redundancies.
  • Security Sanitization: Aggressively filters file paths to remove directory traversal attacks (../) and characters invalid on Windows (<>:"/\|?*).
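
The binary and encoding handling described above could look roughly like the following. This is a minimal sketch, not the script's actual code: the function names, the 1024-byte sample size, and the order in which encodings are tried are assumptions. Only the null-byte check and the two encodings come from this README.

```python
from pathlib import Path
from typing import Optional

# Encodings named in the README; the order is an assumption.
ENCODINGS = ("utf-8-sig", "latin-1")


def looks_binary(data: bytes, sample_size: int = 1024) -> bool:
    """Treat content as binary if its first bytes contain a null byte."""
    return b"\x00" in data[:sample_size]


def read_text_safely(path: Path) -> Optional[str]:
    """Return the decoded text, or None if the file looks binary."""
    data = path.read_bytes()
    if looks_binary(data):
        return None
    for encoding in ENCODINGS:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Unreachable while latin-1 is in the list (it accepts every byte),
    # but kept so the function stays correct if ENCODINGS changes.
    return None
```

Because latin-1 can decode any byte sequence, it works as a last-resort fallback; the trade-off is that text in some other encoding may come out garbled instead of raising an error.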

📦 Installation

  1. Ensure you have Python 3.6+ installed.
  2. Download directory-control.py.
  3. Place the script in the root folder of your project.

📖 Usage

Run the script from your terminal:

python directory-control.py

You will be presented with two options:

Option 1: SCAN & GENERATE

Best for: Giving context to an LLM.

This scans your current directory to generate a file named directory-structure.txt.

  • Clean Output: It automatically filters out standard system junk (.git, __pycache__, node_modules).
  • Smart Ignoring: It reads your local .gitignore file (if present) and excludes any matching patterns (e.g., *.log, dist/, coverage/) so you don't overwhelm the LLM with irrelevant file paths.
  • Use Case: Copy the content of this text file and paste it into ChatGPT/Claude so the AI understands your current project structure.
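
As an illustration of the scan step, here is a hedged sketch of a recursive tree renderer that skips the system exclusions and lists symbolic links without following them. The function name `render_tree` and the exact output layout are assumptions; .gitignore handling is left out to keep the example short.

```python
import os
from typing import List

# Names skipped unconditionally, taken from the README's exclusion list.
SYSTEM_EXCLUDES = {".git", "node_modules", "__pycache__", ".env", ".DS_Store"}


def render_tree(root: str, prefix: str = "") -> List[str]:
    """Return tree lines for `root`, listing symlinks without recursing."""
    with os.scandir(root) as it:
        entries = sorted(
            (e for e in it if e.name not in SYSTEM_EXCLUDES),
            key=lambda e: e.name,
        )
    lines: List[str] = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        branch = "└── " if last else "├── "
        if entry.is_symlink():
            # Show where the link points, but never descend into it.
            target = os.readlink(entry.path)
            lines.append(f"{prefix}{branch}{entry.name} -> {target}")
        elif entry.is_dir(follow_symlinks=False):
            lines.append(f"{prefix}{branch}{entry.name}/")
            child_prefix = prefix + ("    " if last else "│   ")
            lines.extend(render_tree(entry.path, child_prefix))
        else:
            lines.append(f"{prefix}{branch}{entry.name}")
    return lines
```

Calling `"\n".join(render_tree("."))` yields text in the Standard Tree style. Checking `is_symlink()` before `is_dir()` is what keeps a link that points back at a parent folder from causing endless recursion.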

Option 2: READ & BUILD

Best for: Applying an LLM's architecture.

This looks for a structure file (e.g., structure.txt, tree.md) and builds the file system.

  1. Ask an LLM to generate a project structure.
  2. Save the LLM's output into a text file (e.g., structure.txt) in your project root.
  3. Run the script and select Option 2.
  4. Verification: The script will parse the file and display a summary (Working Root, Number of Parsed Items) and a warning.
  5. Confirmation: You must type y to proceed.
  6. The script will:
    • Analyze: Parse the text file and identify nodes.
    • Sanitize: Clean up invalid characters and resolve root-wrapping issues.
    • Execute: Create missing directories, create empty placeholder files, and move existing files to their new locations (with collision protection).
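
The Execute step can be sketched as below. This is an illustrative outline under stated assumptions: the helper names (`unique_destination`, `safe_move`, `ensure_placeholder`) are hypothetical, and the numbering scheme beyond `_1` is a guess based on the `script_1.py` example.

```python
import shutil
from pathlib import Path


def unique_destination(dest: Path) -> Path:
    """Return `dest` if it is free, else stem_1.ext, stem_2.ext, and so on."""
    if not dest.exists():
        return dest
    counter = 1
    while True:
        candidate = dest.with_name(f"{dest.stem}_{counter}{dest.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1


def safe_move(src: Path, dest: Path) -> Path:
    """Move `src` to `dest` without overwriting; return the final path."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    final = unique_destination(dest)
    shutil.move(str(src), str(final))
    return final


def ensure_placeholder(path: Path, is_dir: bool) -> None:
    """Create a missing directory or an empty file; leave existing ones alone."""
    if is_dir:
        path.mkdir(parents=True, exist_ok=True)
    else:
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
```

For example, `safe_move(Path("script.py"), Path("lib/script.py"))` returns `lib/script_1.py` when `lib/script.py` already exists, and the existing file is left untouched.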

🧠 Supported Formats

The script uses a Robust Parsing Engine that ignores decorative characters and focuses on indentation. It supports all of the following styles:

Style A: Standard Tree (Common in Linux/DOS)

project/
├── src/
│   ├── main.py
│   └── utils.py
└── README.md

Style B: Markdown Lists (Common in ChatGPT)

- project
    - src
        - main.py
        - utils.py
    - README.md

Style C: ASCII Art (Old school)

project
|-- src
|   +-- main.py
|   +-- utils.py
+-- README.md
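
The idea of ignoring decoration and relying on indentation can be sketched as follows. This is a hedged approximation, not the script's parser: the set of decorative characters, the tab width of 4, and the function names are all assumptions.

```python
import re
from typing import List, Tuple

# Leading characters treated as decoration: whitespace, box-drawing glyphs,
# ASCII tree characters, and Markdown bullets. This set is an assumption.
DECORATION = re.compile(r"^[\s│├└─|+\-*`]*")


def split_line(line: str) -> Tuple[int, str]:
    """Return (indent width, name) for a single tree line."""
    expanded = line.expandtabs(4).rstrip()
    prefix = DECORATION.match(expanded).group(0)
    return len(prefix), expanded[len(prefix):]


def parse_depths(text: str) -> List[Tuple[int, str]]:
    """Turn raw indent widths into nesting depths using a stack."""
    result: List[Tuple[int, str]] = []
    stack: List[int] = []  # indent widths of the current ancestors
    for raw in text.splitlines():
        width, name = split_line(raw)
        if not name or name.startswith("#"):
            continue  # skip blank lines and comment headers
        while stack and stack[-1] >= width:
            stack.pop()
        result.append((len(stack), name))
        stack.append(width)
    return result
```

All three styles above produce the same depth sequence (0, 1, 2, 2, 1) with this approach, because only the width of the stripped prefix matters. One known limitation of the sketch: a name that itself begins with `-`, `+`, or `*` would lose that leading character.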

Example Output

# Generated Structure
# Files: 57 | Dirs: 11

numbers/
├── CHANGELOG.md
├── CONTRIBUTING.md
├── Dockerfile
├── LICENSE.md
├── README.md
├── app/
│   ├── __init__.py
│   ├── api/
│   │   ├── __init__.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── requests.py
│   │   │   └── responses.py
│   │   ├── routers/
│   │   │   ├── admin.py
│   │   │   ├── catalan.py
│   │   │   ├── e.py
│   │   │   ├── eulers.py
│   │   │   ├── general.py
│   │   │   ├── legacy.py
│   │   │   ├── lemniscate.py
│   │   │   ├── log10.py
│   │   │   ├── log2.py
│   │   │   ├── log3.py
│   │   │   ├── phi.py
│   │   │   ├── pi.py
│   │   │   ├── sqrt2.py
│   │   │   ├── sqrt3.py
│   │   │   ├── websocket.py
│   │   │   └── zeta3.py
│   │   └── websocket_manager.py
│   ├── core/
│   │   ├── __init_.py
│   │   ├── config.py
│   │   ├── constants.py
│   │   ├── exceptions.py
│   │   └── redis_client.py
│   ├── main.py
│   ├── storage/
│   │   ├── __init__.py
│   │   ├── base_storage.py
│   │   ├── binary_source.py
│   │   ├── exceptions.py
│   │   ├── file_source.py
│   │   ├── manager.py
│   │   ├── multi_manager.py
│   │   └── sqlite_source.py
│   └── utils/
│       └── __init__.py
├── compose.yml
├── data/
├── docs/
│   ├── REDIS_WEBSOCKET_QUICK_REF.md
│   └── REDIS_WEBSOCKET_USER_GUIDE.md
├── frontend/
│   ├── index.html
│   └── websocket_test.html
├── nginx.conf
├── project.toml
├── rebuild.sh
├── requirements-dev.txt
├── requirements.txt
├── setup_project.sh
├── start.sh
├── stop.sh
├── tests/
│   └── __init__.py
└── update_to_redis_websocket.sh


⚙️ How it Works (The Logic)

When parsing a text file, the script must decide if a line represents a File or a Directory. Since LLM output varies, the script uses the following logic hierarchy:

1. Classification Logic

  1. Explicit Syntax: If the name ends with / or \, it is a Directory.
  2. Extension Check: If the name has a file extension (e.g., .py, .json, .js), it is a File.
  3. Indentation Lookahead: If line A is followed by line B, and line B is indented deeper than line A, then line A is treated as a Directory (files cannot contain children).
  4. Common Naming Conventions: If ambiguous, names like src, dist, bin, assets, config, tests are treated as Directories.
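
The four rules can be expressed as one ordered chain of checks. The sketch below assumes the input is already a list of (depth, name) pairs; the function name `classify` and the fallback of treating anything undecided as a file are assumptions, not confirmed behavior.

```python
import os
from typing import List, Tuple

# Names treated as directories when nothing else decides; from the README.
COMMON_DIRS = {"src", "dist", "bin", "assets", "config", "tests"}


def classify(nodes: List[Tuple[int, str]]) -> List[Tuple[str, bool]]:
    """Return (name, is_dir) for each (depth, name) node, rules in order."""
    out: List[Tuple[str, bool]] = []
    for i, (depth, name) in enumerate(nodes):
        next_depth = nodes[i + 1][0] if i + 1 < len(nodes) else -1
        if name.endswith(("/", "\\")):
            is_dir = True                          # 1. explicit syntax
        elif os.path.splitext(name)[1]:
            is_dir = False                         # 2. has a file extension
        elif next_depth > depth:
            is_dir = True                          # 3. indentation lookahead
        else:
            is_dir = name.lower() in COMMON_DIRS   # 4. naming convention
        out.append((name.rstrip("/\\"), is_dir))
    return out
```

Note that `os.path.splitext` does not treat a leading dot as an extension, so names such as `.env` or `Dockerfile` fall through to rules 3 and 4. With the ordering listed above, a folder named like `v1.2` without a trailing slash would be classified as a file, which is one reason the prompting advice asks for trailing slashes.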

2. The "Smart" Security Layers (v0.0.8)

  • Non-Destructive Operations:
    • This script NEVER deletes files.
    • This script NEVER overwrites files. If a move operation conflicts, it renames the file to filename_1.ext.
  • Root Wrapper Detection:
    • Problem: You are in my-app. The LLM outputs a tree starting with my-app/src/....
    • Old Behavior: You ended up with my-app/my-app/src/....
    • New Behavior: The script sees the top-level node matches the CWD name, ignores the top node, and maps src directly to the current directory.
  • Path Sanitization:
    • Leading slashes (/var/www) are stripped to force relative paths (var/www).
    • Windows-invalid characters (<, >, :, ", |, ?, *) are stripped from filenames to prevent OS errors.
    • .. is removed to prevent directory traversal attacks.
  • Symlink & Binary Safety:
    • The scanner detects symbolic links and refuses to follow them, preventing infinite loops.
    • The file reader checks for binary content (Null bytes) to prevent the script from trying to parse images or executables as text.
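
A rough sketch of the path sanitization and root wrapper detection described above is shown here. The function names are hypothetical, and dropping whole `..` segments (instead of removing the substring wherever it appears) is an assumption about how the rule is applied.

```python
import re
from typing import List

# Characters the README lists as invalid in Windows filenames.
INVALID_CHARS = re.compile(r'[<>:"|?*]')


def sanitize_path(raw: str) -> str:
    """Return a relative, traversal-free path using forward slashes."""
    parts: List[str] = []
    for part in raw.replace("\\", "/").split("/"):
        part = INVALID_CHARS.sub("", part).strip()
        if part in ("", ".", ".."):
            continue  # drops leading slashes, empty segments, and traversal
        parts.append(part)
    return "/".join(parts)


def strip_root_wrapper(paths: List[str], cwd_name: str) -> List[str]:
    """Un-nest paths whose shared top-level folder matches the CWD name."""
    if not paths or any(p.split("/")[0] != cwd_name for p in paths):
        return paths
    # Drop the wrapper node itself and remove it from every child path.
    return [p.split("/", 1)[1] for p in paths if "/" in p]
```

With these helpers, `sanitize_path("/var/www")` gives `var/www`, and `strip_root_wrapper(["my-app", "my-app/src"], "my-app")` gives `["src"]`, which matches the New Behavior described above.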

3. Exclusion Mechanisms

  • System Exclusions: The script explicitly ignores its own source file, .git, node_modules, .env, .DS_Store, and __pycache__ to prevent clutter.
  • Dynamic Exclusions (Gitignore): The script parses your .gitignore using wildcard matching (fnmatch). It checks patterns against the filename (e.g., error.log) AND the relative path (e.g., build/output/main.js).
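
The dual filename and relative-path check could be sketched like this. It is a simplified illustration under assumptions: the function names are hypothetical, negation patterns (`!keep.log`) are skipped, and `fnmatch.fnmatchcase` is used so matching behaves the same on every platform. Full .gitignore semantics are richer than plain wildcard matching.

```python
import fnmatch
from typing import List


def load_patterns(gitignore_text: str) -> List[str]:
    """Keep usable pattern lines; strip surrounding slashes."""
    patterns: List[str] = []
    for line in gitignore_text.splitlines():
        line = line.strip()
        # Blank lines and comments carry no pattern; negation ("!")
        # is not supported in this sketch.
        if not line or line.startswith(("#", "!")):
            continue
        patterns.append(line.strip("/"))
    return patterns


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Match every pattern against the filename and the relative path."""
    rel_path = rel_path.replace("\\", "/")
    name = rel_path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatchcase(name, pattern)
        or fnmatch.fnmatchcase(rel_path, pattern)
        for pattern in patterns
    )
```

In `fnmatch`, `*` also matches `/`, so a pattern like `build/output/*.js` matches the relative path `build/output/main.js` even though it does not match the bare filename.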

🤖 Prompting Advice

To get the best results from your LLM, use a prompt like this:

"Generate a directory structure for a [Python/Node/etc] project. Please output the structure as a standard ASCII tree or a Markdown list. Ensure folders end with a slash '/'."

Example of ideal input text file:

/my-app
    /backend
        server.py
        config.py
    /frontend
        index.html
        styles.css
    README.md
    .env.example

📝 License

This script is provided "as-is" under the MIT License. It is open source and free to modify.

Disclaimer: Always back up your data before running bulk file operations. While this script includes safety checks and only creates/moves files (never deletes), automated file manipulation always carries inherent risks.
