A powerful research agent that can perform comprehensive, in-depth investigations on any topic or analyze specific websites. The tool uses advanced web crawling techniques and AI to gather, synthesize, and present information in a structured format.
- Two Modes: Research mode for investigating topics and Crawl mode for analyzing specific websites
- Clarifying Questions: The app asks intelligent questions to better understand your research needs
- Real-time Progress: View the research process as it happens with live logs
- Rich Results: View research results as formatted markdown, HTML, or download for later use
- Customizable Parameters: Control the depth and breadth of research
- Structured Output: Results are saved in both Markdown and JSON formats
- Python 3.8+
- OpenAI API key
- Required packages (see installation section)
- Clone this repository:
git clone https://github.com/yourusername/deep-research-agent.git
cd deep-research-agent
- Install required packages:
pip install -r requirements.txt
# or install the required packages using the Makefile:
make install
- Create a .env file in the project root and add your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
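Once the key is in place, a quick sanity check like the following confirms the process can actually see it. This is a sketch, not code from the project; check_api_key is a hypothetical helper, and if you use a .env file you would load it first (for example with python-dotenv):

```python
import os

def check_api_key(env=None):
    """Return a masked confirmation string, or exit if the key is missing.

    `env` defaults to os.environ; a dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise SystemExit("OPENAI_API_KEY is not set - check your .env file")
    # Never print the full key; show only the last 4 characters.
    return "API key loaded (ends with ...%s)" % key[-4:]
```

Running check_api_key() before launching the app gives a clearer error than a failed OpenAI call later on.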
Run the web interface with:
streamlit run app.py
# or use the make command
make run
This will launch a browser-based interface at http://localhost:8501 where you can:
- Select "Research Query" mode in the sidebar
- Enter your research question in the text area
- Adjust the research parameters in the sidebar (depth, breadth, iterations)
- Click "Start Research"
- Answer the clarifying questions (if any)
- Watch the research process in real-time
- View and download the results
- Select "Website Crawl" mode in the sidebar
- Enter the URL you want to analyze
- Adjust the crawl parameters in the sidebar (depth, max pages)
- Click "Start Crawling"
- Watch the crawling process in real-time
- View and download the results
For the command line interface, use main.py with appropriate arguments:
python main.py --query "your research question here" --depth 2 --breadth 5 --iterations 3

python main.py --url "https://example.com" --depth 2 --breadth 5

python main.py --query "Research the trends and changes in exterior styling, interior features, and user experience across all Mercedes-Benz E-Class, S-Class, and Maybach models from 2023 to 2025, so the conclusions can be applied to washing machine exterior design" --depth 3 --breadth 6

- --query/-q: The research question or topic to investigate
- --url/-u: Starting URL for website crawling (alternative to --query)
- --depth/-d: How many levels deep to crawl from each page (default: 2)
- --breadth/-b: Number of top search results to explore, or max pages for a website crawl (default: 5)
- --iterations/-i: Number of research cycles to perform (default: 3; research mode only)
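For reference, an argument parser matching these flags might be defined as follows. This is a sketch using argparse under the assumption that main.py follows the flag descriptions above; the project's actual entry point may differ:

```python
import argparse

def build_parser():
    """Build a CLI parser mirroring the documented main.py flags."""
    p = argparse.ArgumentParser(description="Deep Research Agent CLI")
    p.add_argument("--query", "-q", help="research question or topic to investigate")
    p.add_argument("--url", "-u", help="starting URL for website crawling")
    p.add_argument("--depth", "-d", type=int, default=2,
                   help="levels to crawl deep from each page")
    p.add_argument("--breadth", "-b", type=int, default=5,
                   help="top results to explore, or max pages for a crawl")
    p.add_argument("--iterations", "-i", type=int, default=3,
                   help="research cycles to perform (research mode only)")
    return p
```

Note that --query and --url select the mode: exactly one of them should be provided for a given run.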
- Be specific in your research questions
- Provide answers to clarifying questions for more focused research
- Adjust depth and breadth based on your needs:
- Higher depth follows links more levels deep from each page
- Higher breadth analyzes more search results per query
- More iterations enable more thorough research
The tool generates two output files:
- A Markdown file with formatted research findings
- A JSON file containing raw data for further processing
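The JSON file is intended for programmatic use. A loader along these lines shows the idea; the "sources" and "sections" keys are illustrative assumptions, so inspect your actual output file for the real schema:

```python
import json
from pathlib import Path

def summarize_results(json_path):
    """Load the raw JSON output and report basic stats.

    The 'sources' and 'sections' keys used here are hypothetical -
    adapt them to the schema your output file actually contains.
    """
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return {
        "num_sources": len(data.get("sources", [])),
        "section_titles": [s.get("title") for s in data.get("sections", [])],
    }
```

From there the raw findings can be fed into any downstream pipeline (dashboards, further LLM analysis, and so on).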
The Deep Research Agent uses a combination of web crawling (via Crawl4AI) and large language models to:
- Perform initial searches based on your query
- Extract relevant information from search results
- Generate follow-up questions to dive deeper
- Synthesize findings into a comprehensive report
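The four steps above can be sketched as a simple loop. The helper functions here are stand-ins for the real Crawl4AI and LLM calls, passed in as parameters precisely because this is an illustration of the control flow rather than the project's actual API:

```python
def research(query, iterations=3, *, search, extract, follow_up, synthesize):
    """Run a simplified research loop.

    search/extract/follow_up/synthesize are hypothetical callables
    standing in for the agent's web-crawling and LLM steps.
    """
    findings = []
    question = query
    for _ in range(iterations):
        results = search(question)         # 1. search based on the query
        findings.extend(extract(results))  # 2. extract relevant information
        question = follow_up(findings)     # 3. generate a deeper follow-up
    return synthesize(findings)            # 4. synthesize the final report
```

Each iteration refines the question, which is why raising --iterations tends to surface deeper material at the cost of runtime.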
- If the app fails to start, check that OPENAI_API_KEY is set correctly in your .env file
- If research seems slow, try reducing depth and breadth parameters
- If you encounter errors, check the console output for more details
make run - Run the Streamlit web application
make research - Run the Deep Research Agent from command line
make clean - Clean up cached files and temporary data
make help - Show all available Make commands
This project is licensed under the MIT License - see the LICENSE file for details.