- Introduction
- Architecture
- Installation Guide
- Usage Guide
- API Reference
- Algorithm Details
- Docker Environment
- Development Guide
- Troubleshooting
- Future Roadmap
Data Art is an experimental project that explores the intersection of data visualization and digital art. By transforming textual information into visual patterns, we create unique "data fingerprints" that represent the underlying content in an aesthetically interesting way.
- Digital Art: Create unique artwork from favorite quotes, poems, or texts
- Data Visualization: Visualize text patterns and structures
- Security: Visual representation of text for quick comparison (visual hashing)
- Education: Teaching concepts of ASCII, color theory, and data representation
┌─────────────────────┐
│     Input Text      │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│  Text Preprocessor  │ (Future: NLTK integration)
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│   ASCII Converter   │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│     RGB Mapper      │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│   Image Generator   │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│     PNG Output      │
└─────────────────────┘
- Language: Python 3
- Image Processing: Pillow (PIL Fork)
- NLP: NLTK (Natural Language Toolkit)
- Containerization: Docker
- Version Control: Git
- Docker Desktop (Windows/Mac) or Docker Engine (Linux)
- At least 1GB free disk space
- Git
- Clone the repository
    git clone <repository-url>
    cd data-art
- Navigate to text directory
    cd text
- Build Docker image
    chmod +x build.sh
    ./build.sh
  Expected output:
    [+] Building 23.5s (8/8) FINISHED
    => [internal] load build definition from Dockerfile
    => => transferring dockerfile: 297B
    ...
    => => naming to docker.io/library/python-data-art
- Verify installation
    docker images | grep python-data-art
- Python 3.6 or higher
- pip package manager
- Clone the repository
    git clone <repository-url>
    cd data-art/text
- Create virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies
    pip install -r requirements.txt
    pip install Pillow
    python -m nltk.downloader wordnet
- Start container
    ./run.sh
- Inside container, generate art
    python pil.py
- Check output
    ls -la putPixel.png
- Exit container
    exit
cd text
python pil.py
Edit pil.py and modify the text variable:
text = "Your custom text here. The longer the text, the larger the image!"
Create a script to process multiple texts:
import os
from pil import create_image_from_text  # Requires refactoring pil.py

texts = [
    ("quote1.txt", "To be or not to be"),
    ("quote2.txt", "Hello, World!"),
]

for filename, content in texts:
    # Drop the .txt extension so the output is quote1.png, not quote1.txt.png
    stem = os.path.splitext(filename)[0]
    create_image_from_text(content, f"{stem}.png")

Calculate image dimensions for given text (currently unused).
Parameters:
- text (str): Input text to process
Returns:
- None (prints dimensions)
class TextToArt:
    def __init__(self, text, preprocessing=True):
        """Initialize with text and optional preprocessing."""
        pass

    def remove_stopwords(self):
        """Remove common words like 'the', 'is', etc."""
        pass

    def lemmatize(self):
        """Reduce words to their base form."""
        pass

    def to_image(self, output_path, dimensions=None):
        """Convert text to image and save."""
        pass

- Dimension Calculation
    height = int(math.sqrt(len(text))) / 3
    width = height
  Creates a square image based on text length.
- Color Mapping
  - Characters are processed in groups of 3
  - Each character's code (0-255) becomes one color component (red, green, or blue)
  - Example: "ABC" → (65, 66, 67) → Dark gray pixel
- Pixel Placement
  - Pixels are placed left-to-right, top-to-bottom
  - Excess characters are truncated
  - Missing characters would result in black pixels (0,0,0)
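The three steps above can be sketched in a few functions. This is an illustration under assumptions, not the contents of pil.py: `image_side` and `text_to_pixels` are hypothetical names, floor division is assumed so that the side length is a whole number of pixels, and codes above 255 are clamped. `create_image_from_text` borrows its name from the batch example and only shows one way such a helper might look.

```python
import math


def image_side(text):
    """Side length of the square image, following the dimension formula.

    Assumption: floor division, so the result is a whole number of pixels.
    """
    return int(math.sqrt(len(text))) // 3


def text_to_pixels(text, side):
    """Map characters to RGB tuples, three characters per pixel."""
    # Clamping codes above 255 is an assumption, not documented behavior.
    codes = [min(ord(ch), 255) for ch in text]
    pixels = []
    for i in range(side * side):          # characters beyond this are ignored
        group = codes[3 * i:3 * i + 3]
        group += [0] * (3 - len(group))   # pad a short group with 0
        pixels.append(tuple(group))
    return pixels


def create_image_from_text(text, output_path):
    """Hypothetical helper: render text to a PNG with Pillow."""
    from PIL import Image  # lazy import: the functions above need no Pillow

    side = image_side(text)
    if side < 1:
        raise ValueError("text is too short to fill a single pixel")
    img = Image.new("RGB", (side, side))  # starts out black
    for i, rgb in enumerate(text_to_pixels(text, side)):
        img.putpixel((i % side, i // side), rgb)  # left-to-right, top-to-bottom
    img.save(output_path)
```

With this formula a 9-character text gives a 1x1 image whose single pixel comes from the first three characters, so `text_to_pixels("ABCDEFGHI", 1)` returns `[(65, 66, 67)]`.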
- Lowercase letters: ASCII 97-122 (medium-high values)
- Uppercase letters: ASCII 65-90 (medium values)
- Numbers: ASCII 48-57 (low values)
- Spaces: ASCII 32 (very low value)
- Special characters: Various ranges
This creates natural color patterns where:
- Text with many spaces appears darker
- Uppercase text appears slightly darker than lowercase
- Numbers create dark regions
- Special characters add color variety
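A quick way to check these brightness claims is to average the character codes of a few sample strings. The helper name `mean_channel_value` and the sample strings are illustrative assumptions; a higher average means brighter pixels.

```python
def mean_channel_value(text):
    """Average character code of text on the 0-255 channel scale."""
    codes = [min(ord(ch), 255) for ch in text]
    return sum(codes) / len(codes)


samples = {
    "lowercase": "hello",   # 106.4
    "uppercase": "HELLO",   # 74.4
    "digits": "12345",      # 51.0
    "spaces": "     ",      # 32.0
}

for label, sample in samples.items():
    print(f"{label:>9}: {mean_channel_value(sample):.1f}")
```

The ordering matches the list above: spaces are darkest, then digits, then uppercase, with lowercase brightest.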
# Base Python 3 image
FROM python:3
# Set up the application directory
WORKDIR /usr/src/app
# Copy the dependency list
COPY requirements.txt ./
# Install packages
RUN pip install --no-cache-dir -r requirements.txt
# Download NLTK data
RUN python -m nltk.downloader -d /usr/local/share/nltk_data wordnet
# Working directory
WORKDIR /opt/work

The run.sh script mounts the current directory:
-v $(pwd):/opt/work
This allows:
- Live code editing
- Persistent output files
- No need to rebuild for code changes
- --rm: Automatically remove container after exit
- -it: Interactive terminal
- No persistent data inside container
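For reference, the flags and mount described above combine into a single `docker run` invocation. The sketch below only assembles the argument list and does not execute it. The contents of run.sh are not reproduced in this document, so the exact command is an assumption: the function name `docker_run_command` and the example host path are hypothetical, and the script may pass extra arguments such as a shell to start.

```python
import os


def docker_run_command(image="python-data-art", host_dir=None):
    """Build the argument list for an interactive, self-removing container."""
    host_dir = host_dir or os.getcwd()
    return [
        "docker", "run",
        "--rm",                         # remove the container after exit
        "-it",                          # interactive terminal
        "-v", f"{host_dir}:/opt/work",  # mount the host directory
        image,
    ]


# Example host path is hypothetical.
print(" ".join(docker_run_command(host_dir="/home/user/data-art/text")))
```

Passing the list to `subprocess.run` would start the container. Printing it first makes it easy to compare against what run.sh actually does.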
- Follow PEP 8 guidelines
- Use descriptive variable names for clarity
- Add docstrings to all functions
- Comment complex algorithms
Currently no formal tests. Recommended approach:
def test_ascii_conversion():
    assert ord('A') == 65
    assert ord('a') == 97
    assert ord(' ') == 32

def test_color_range():
    for char in "Sample Text":
        assert 0 <= ord(char) <= 255

- Fork the repository
- Create feature branch:
    git checkout -b feature-name
- Make changes and test
- Commit with clear messages
- Push and create pull request
Example: Adding color schemes
class ColorScheme:
    def __init__(self, name):
        self.name = name

    def map_char_to_color(self, char):
        """Override in subclasses"""
        raise NotImplementedError

class GrayscaleScheme(ColorScheme):
    def map_char_to_color(self, char):
        val = ord(char)
        return (val, val, val)

class InvertedScheme(ColorScheme):
    def map_char_to_color(self, char):
        val = 255 - ord(char)
        return (val, val, val)

Error: Cannot connect to Docker daemon
Solution: Ensure Docker is running
sudo systemctl start docker # Linux
# Or start Docker Desktop on Windows/Mac
Error: Permission denied: ./build.sh
Solution: Make scripts executable
chmod +x build.sh run.sh
Error: ValueError: width and height must be > 0
Solution: Ensure text is long enough (at least 9 characters)
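The 9-character minimum follows from the dimension formula: the square root of the text length is truncated and then divided by 3, so anything shorter than 9 characters gives a side length of 0. The sketch below works through that arithmetic and adds an early check. `side_length` and `require_min_length` are hypothetical names, and floor division is assumed.

```python
import math


def side_length(n_chars):
    """Image side for a text of n_chars characters (floor division assumed)."""
    return int(math.sqrt(n_chars)) // 3


def require_min_length(text):
    """Fail early with a clearer message than Pillow's size error."""
    if side_length(len(text)) < 1:
        raise ValueError(
            f"text has {len(text)} characters; at least 9 are needed"
        )
    return text


for n in (8, 9, 35, 36, 81):
    print(n, "->", side_length(n))
# 8 -> 0, 9 -> 1, 35 -> 1, 36 -> 2, 81 -> 3
```

The side length grows by one each time the text length reaches the next multiple-of-3 square: 9, 36, 81, and so on.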
Error: ModuleNotFoundError: No module named 'PIL'
Solution: Install Pillow
pip install Pillow
Error: LookupError: Resource wordnet not found
Solution: Download NLTK data
python -m nltk.downloader wordnet
For large texts (>10,000 characters):
- Image generation may be slow
- Consider chunking text
- Use lower resolution by modifying dimension calculation
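One way to apply the chunking suggestion is to split the text into fixed-size pieces and render each piece as its own image. The function name `chunk_text` and the default chunk size are illustrative assumptions, not part of the project.

```python
def chunk_text(text, chunk_size=10_000):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


long_text = "lorem ipsum " * 2_500  # 30,000 characters of sample input
chunks = chunk_text(long_text)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk can then be passed to the image generator separately, which keeps every individual image small.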
- Natural Language Processing
  - Stop word removal
  - Lemmatization
  - Named entity recognition for special coloring
- Enhanced Visualization
  - Multiple color schemes
  - Non-square image shapes
  - Animation support (GIF output)
- User Interface
  - Command-line arguments
  - Web interface with Flask/Django
  - Real-time preview
- Advanced Algorithms
  - Word frequency-based coloring
  - Sentiment analysis coloring
  - Topic modeling visualization
- Output Formats
  - SVG for scalable graphics
  - PDF for print quality
  - Video for text animations
- Machine learning-based color mapping
- 3D visualizations
- Interactive exploration of text regions
- Collaborative art creation
- API service for text-to-art generation
Key ranges for color mapping:
- 0-31: Control characters (avoid)
- 32-47: Space and symbols
- 48-57: Numbers (0-9)
- 58-64: More symbols
- 65-90: Uppercase letters (A-Z)
- 91-96: More symbols
- 97-122: Lowercase letters (a-z)
- 123-126: More symbols
- 127: Delete (control character, avoid)
- 128-255: Extended ASCII
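The ranges above can be written as a small lookup table, which also gives a simple way to drop the control characters marked "avoid". The names `classify` and `strip_control_chars` are hypothetical. Code 127 (DEL) is treated as a control character here, and codes above 255 are reported as out of range.

```python
RANGES = [
    (0, 31, "control"),
    (32, 47, "space/symbol"),
    (48, 57, "digit"),
    (58, 64, "symbol"),
    (65, 90, "uppercase"),
    (91, 96, "symbol"),
    (97, 122, "lowercase"),
    (123, 126, "symbol"),
    (127, 127, "control"),
    (128, 255, "extended"),
]


def classify(ch):
    """Return the range label for a single character."""
    code = ord(ch)
    for low, high, label in RANGES:
        if low <= code <= high:
            return label
    return "out of range"


def strip_control_chars(text):
    """Remove characters whose code falls in a control range."""
    return "".join(ch for ch in text if classify(ch) != "control")


print(classify("A"), classify("7"), classify(" "))
print(repr(strip_control_chars("a\tb\nc")))
```

Note that stripping control characters also removes tabs and newlines, so words on adjacent lines are joined. Replacing them with spaces first is an alternative if word boundaries matter.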
Different text types produce different visual patterns:
- Poetry: Often balanced, rhythmic patterns
- Code: High contrast with many symbols
- Prose: Smooth, flowing color gradients
- Data: Repetitive patterns based on structure