A fast command-line tool for peeking at tabular data files (CSV, TSV, Parquet, Feather, Excel) with concise, chainable options inspired by Unix tools like ls.

Install from PyPI (recommended):

    pip install dfpeek

Or, for Excel/Parquet support:

    pip install dfpeek[excel,parquet]

Run from the command line:

    dfpeek <datafile> [options]

| Option | Description |
|---|---|
| `-f FORMAT` | Force file format (csv, tsv, excel, parquet, feather) |
| `-d DELIM` | Set delimiter for CSV/TSV files (e.g., `,` or `\t`) |
| `-xs N` | Select Excel sheet N (1-based indexing) |
| `-xr N` | Skip first N rows in Excel files |
| `-H N` | Show first N rows |
| `-T N` | Show last N rows |
| `-R START END` | Show rows in range START to END (zero-based, END exclusive) |
| `-L EXPR` | Perform `df.loc[expression]` for flexible row/column selection |
| `-I EXPR` | Perform `df.iloc[expression]` for position-based selection |
| `-u COL` | Show unique values for column COL |
| `-c COL` | Show info about column COL (type, nulls, etc.) |
| `-v COL` | Show value counts for column COL |
| `-s COL` | Show stats for numerical column COL |
| `-l` | List column names |
| `-i` | Show file info (rows, columns, memory usage) |

All options can be chained in any order.
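
To make the row and column options concrete, here is a minimal standard-library sketch of what `-H`, `-T`, `-R`, `-u` and `-v` compute on a small table. It is an illustration only, not dfpeek's implementation: the sample data and every function name are hypothetical, and it assumes `-R` behaves like a Python slice (in pandas terms, presumably `df.iloc[START:END]`).

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; not shipped with dfpeek.
SAMPLE = """name,city,status
Ann,Oslo,active
Bob,Rome,inactive
Cy,Oslo,active
Dee,Lima,active
Eve,Rome,inactive
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))


def head(rows, n):
    """-H N: first N rows."""
    return rows[:n]


def tail(rows, n):
    """-T N: last N rows."""
    return rows[-n:] if n > 0 else []


def row_range(rows, start, end):
    """-R START END: zero-based, END exclusive (assumed slice semantics)."""
    return rows[start:end]


def unique(rows, col):
    """-u COL: distinct values in first-seen order."""
    return list(dict.fromkeys(r[col] for r in rows))


def value_counts(rows, col):
    """-v COL: (value, count) pairs, most frequent first."""
    return Counter(r[col] for r in rows).most_common()


print([r["name"] for r in row_range(rows, 1, 3)])  # ['Bob', 'Cy']
print(unique(rows, "city"))                        # ['Oslo', 'Rome', 'Lima']
print(value_counts(rows, "status"))                # [('active', 3), ('inactive', 2)]
```

Note that `row_range(rows, 1, 3)` returns two rows, not three, which is the practical meaning of "END exclusive".
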

Show first 10 rows:

    dfpeek data.feather -H 10

Show last 5 rows:

    dfpeek data.feather -T 5

Show rows 20 through 29 (END is exclusive):

    dfpeek data.feather -R 20 30

Show unique values for column city:

    dfpeek data.feather -u city

Show info about column city:

    dfpeek data.feather -c city

Show value counts for column status:

    dfpeek data.feather -v status

Show stats for column age:

    dfpeek data.feather -s age

List columns:

    dfpeek data.feather -l

Show file info:

    dfpeek data.feather -i

Use loc for label-based selection:

    # Rows only
    dfpeek data.feather -L "0:5"                              # Rows labeled 0 through 5 (loc slices include the end label)
    dfpeek data.feather -L "df.age > 30"                      # Rows where age > 30
    # Columns only
    dfpeek data.feather -L ":, 'name'"                        # All rows, name column
    dfpeek data.feather -L ":, ['name', 'age']"               # All rows, name and age columns
    # Both rows and columns
    dfpeek data.feather -L "0:5, 'name':'city'"               # Rows labeled 0 through 5, columns name through city
    dfpeek data.feather -L "df.age > 25, ['name', 'status']"  # Age > 25, name and status columns

Use iloc for position-based selection:

    # Rows only
    dfpeek data.feather -I "0:5"             # First 5 rows
    dfpeek data.feather -I "[0,2,4]"         # Rows at positions 0, 2, 4
    # Columns only
    dfpeek data.feather -I ":, 0"            # All rows, first column
    dfpeek data.feather -I ":, [0,2]"        # All rows, columns 0 and 2
    # Both rows and columns
    dfpeek data.feather -I "0:5, 0:3"        # First 5 rows, first 3 columns
    dfpeek data.feather -I "[0,2,4], [1,3]"  # Specific rows and columns

Force CSV format for files without .csv extension:

    dfpeek mydata.txt -f csv

Use custom delimiter:

    dfpeek data.tsv -d "\t" -H 5

Show info and first 5 rows (default if no options):

    dfpeek data.feather

Use a specific Excel sheet (e.g., the 3rd sheet):

    dfpeek data.xlsx -xs 3 -H 10

Skip the first 2 rows of an Excel file:

    dfpeek data.xlsx -xr 2

Supported formats:

- CSV (.csv)
- TSV (.tsv)
- Parquet (.parquet)
- Feather (.feather)
- Excel (.xlsx)
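
The extensions listed above, together with the `-f` override, suggest that the format is normally inferred from the file name. A minimal sketch of that idea follows. The function `guess_format`, the `EXTENSION_FORMATS` table and the choice to raise an error on unknown extensions are all assumptions made for illustration; dfpeek's actual behavior for unrecognized extensions is not documented here.

```python
from pathlib import Path
from typing import Optional

# Extension-to-format table built from the supported formats listed above.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".parquet": "parquet",
    ".feather": "feather",
    ".xlsx": "excel",
}


def guess_format(path: str, forced: Optional[str] = None) -> str:
    """Return a format name for `path`; `forced` plays the role of -f."""
    if forced is not None:
        return forced.lower()
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_FORMATS[suffix]
    except KeyError:
        # Assumed behavior: refuse to guess rather than pick a default.
        raise ValueError(
            f"cannot infer format from {suffix!r}; pass one explicitly"
        ) from None


print(guess_format("data.feather"))              # feather
print(guess_format("mydata.txt", forced="csv"))  # csv
```

This mirrors the `dfpeek mydata.txt -f csv` example: without the override, a `.txt` file has no entry in the table, so an explicit format is needed.
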

Notes:

- For very large files, output may be slow if printing many rows.
- All rows/columns are shown in full (no abbreviation).
- Requires Python 3.7+.

License: MIT