Skip to content

A Python tool for splitting Excel workbooks by department - automatically extracts and organizes account data across multiple sheets

License

Notifications You must be signed in to change notification settings

CrueChan/excel-split-tool

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Excel Split Tool

A Python utility for splitting Excel workbooks by department or organizational unit, designed to extract and organize account data across multiple sheets.

Python Version License

Features

  • Multi-sheet Processing: Handles Excel files with multiple worksheets including formal accounts, temporary accounts, public accounts, and public account users
  • Department-based Splitting: Automatically identifies and groups data by organizational departments
  • Flexible Column Detection: Smart column name matching that handles variations in naming and formatting
  • Data Integrity: Maintains relationships between public accounts and their users across split files
  • Batch Processing: Creates separate Excel files for each department with relevant data only
  • Error Handling: Comprehensive validation and error reporting

Supported Sheet Types

The tool is designed to work with Excel files containing these sheet types:

  • 正式账号 (Formal Accounts)
  • 临时账号 (Temporary Accounts)
  • 公用账号 (Public Accounts)
  • 公用账号使用人 (Public Account Users)

Requirements

  • Python 3.12+
  • pandas
  • openpyxl

Installation

Option 1: Using uv (recommended)

git clone https://github.com/CrueChan/excel-split-tool.git
cd excel-split-tool
uv sync

Option 2: Using pip

git clone https://github.com/CrueChan/excel-split-tool.git
cd excel-split-tool
pip install -e .

Option 3: Direct installation

pip install pandas openpyxl

Usage

Basic Usage

from main import split_workbook_by_department

# Split an Excel workbook by department
input_file = 'your-workbook.xlsx'
output_folder = 'output'

split_workbook_by_department(input_file, output_folder)

Command Line Usage

  1. Update the file paths in main.py:
if __name__ == '__main__':
    input_file = 'your-input-file.xlsx'  # Change this to your file
    output_folder = 'output'             # Change this to your output folder
    
    split_workbook_by_department(input_file, output_folder)
  1. Run the script:
python main.py

Input File Format

Your Excel file should contain the following sheets with these key columns:

正式账号 (Formal Accounts)

  • Required column: Contains 门户组织机构第二层级 (Portal Organization Second Level)

临时账号 (Temporary Accounts)

  • Required column: Contains 门户组织机构第二层级 (Portal Organization Second Level)

公用账号 (Public Accounts)

  • Required column: Contains 公用账号所属单位(二) (Public Account Department Level 2)
  • Required column: Contains 账号 (Account)

公用账号使用人 (Public Account Users)

  • Required column: Contains 公用账号 (Public Account)

Output

The tool creates a separate Excel file for each department found in the data:

output/
├── 部门A账号清单.xlsx
├── 部门B账号清单.xlsx
└── 部门C账号清单.xlsx

Each output file contains:

  • 正式账号 sheet with formal accounts for that department
  • 临时账号 sheet with temporary accounts for that department
  • 公用账号 sheet with public accounts for that department
  • 公用账号使用人 sheet with users of those public accounts

File Structure

excel-split-tool/
├── main.py                 # Main script
├── pyproject.toml         # Project configuration
├── README.md              # This file
├── .python-version        # Python version specification
├── uv.lock               # Dependency lock file
├── .gitignore            # Git ignore rules
├── LICENSE               # License file
└── examples/             # Example files (optional)
    └── sample_input.xlsx

Example

# Example usage
from main import split_workbook_by_department

# Process the training center account file
input_file = 'portal-account-audit-form.xlsx'
output_folder = '按部门拆分'

try:
    split_workbook_by_department(input_file, output_folder)
    print("Excel file successfully split by department!")
except ValueError as e:
    print(f"Error: {e}")
except FileNotFoundError:
    print("Input file not found. Please check the file path.")

Error Handling

The tool includes comprehensive error handling for:

  • Missing required columns: Clear error messages indicating which columns are missing
  • Empty or invalid department values: Automatically filters out null/empty department names
  • File I/O errors: Handles file access and permission issues
  • Data type mismatches: Robust handling of different data types

API Reference

split_workbook_by_department(input_file, output_folder)

Splits an Excel workbook into separate files by department.

Parameters:

  • input_file (str): Path to the input Excel file
  • output_folder (str): Path to the output directory (will be created if it doesn't exist)

Raises:

  • ValueError: If required columns are missing from the input file
  • FileNotFoundError: If the input file doesn't exist
  • PermissionError: If there are file access permission issues

Example:

split_workbook_by_department(
    input_file='accounts.xlsx',
    output_folder='output'
)

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

v0.1.0

  • Initial release
  • Basic Excel splitting functionality by department
  • Support for multiple sheet types
  • Smart column detection
  • Comprehensive error handling

Notes

  • The tool preserves original data formatting and structure
  • Only creates output files for departments that have data
  • Automatically handles Chinese column names and content
  • Maintains data relationships between public accounts and their users

About

A Python tool for splitting Excel workbooks by department - automatically extracts and organizes account data across multiple sheets

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages