A Spring Boot REST API service for receiving and storing files to various storage backends (filesystem or MinIO).
- Features
- Quick Start
- API Endpoints
- Configuration
- Storage Backends
- Startup Banner
- Building from Source
- Docker Deployment
- Development
- Logging
- Monitoring and Health
- Troubleshooting
- Migration Guide
- RESTful API - Simple HTTP POST endpoint for file uploads
- Multiple Storage Backends - Filesystem or MinIO object storage
- Spring Boot - Production-ready with built-in health checks, metrics, and configuration management
- Flexible Configuration - YAML-based configuration with profile support and environment variable overrides
- Automatic Initialization - Storage backends are initialized automatically on startup
- Custom Banner - Branded ASCII art banner showing application status
- Comprehensive Logging - Configurable logging with multiple levels
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

This starts the server on port 4000 and stores files in the ./files directory.
You'll see a custom ASCII banner on startup:
____ _ _ _ ____ _
| _ \ __ _| |_ __ _ | | __ _ | | _____ | _ \ ___ ___ ___(_)_ _____ _ __
| | | |/ _` | __/ _` | | | / _` || |/ / _ \ | |_) / _ \/ __/ _ \ \ \ / / _ \ '__|
| |_| | (_| | || (_| | | |__| (_| || < __/ | _ < __/ (_| __/ |\ V / __/ |
|____/ \__,_|\__\__,_| |_____\__,_||_|\_\___| |_| \_\___|\___\___|_| \_/ \___|_|
:: Spring Boot 3.2.0 ::
:: Application: Data Lake Receiver ::
:: Storage: FILESYSTEM ::
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --spring.profiles.active=minio \
--storage.minio.endpoint=http://localhost:9000 \
--storage.minio.access-key=minioadmin \
--storage.minio.secret-key=minioadmin

export SERVER_PORT=8080
export STORAGE_TYPE=MINIO
export STORAGE_MINIO_ENDPOINT=http://localhost:9000
export STORAGE_MINIO_ACCESS_KEY=minioadmin
export STORAGE_MINIO_SECRET_KEY=minioadmin
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

The application supports three ways to specify the filename:
# Upload to specific filename using path
curl -X POST http://localhost:4000/myfile.txt \
--data-binary @localfile.txt
# Upload with subdirectory structure
curl -X POST http://localhost:4000/documents/report.pdf \
--data-binary @report.pdf
# Filename resolution from path
curl -X POST http://localhost:4000/logs/2024/app.log \
--data-binary @app.log

# Upload with custom filename via header
curl -X POST http://localhost:4000/ \
-H "X-file-name: myfile.txt" \
--data-binary @localfile.txt

# Upload without filename (auto-generated timestamp-based name)
curl -X POST http://localhost:4000/ \
--data-binary @localfile.txt

The application resolves filenames with the following priority:
- Request path - If the POST is made to a specific path (e.g., /a.txt, /dir/b.json)
- X-file-name header - If provided and the path is /
- Auto-generated - Timestamp-based name (e.g., 1234567890.data)
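The priority order above can be sketched in plain Java. This is an illustration only, not the project's controller code: the class name `FilenameResolver`, the method signature, and the trimming of the header value are assumptions made for the example.

```java
// Hypothetical helper illustrating the documented priority order.
final class FilenameResolver {
    private FilenameResolver() {}

    /**
     * Picks the storage name: request path first, then the X-file-name
     * header, then a timestamp-based fallback.
     */
    static String resolve(String requestPath, String headerValue, long nowMillis) {
        // "/documents/report.pdf" -> "documents/report.pdf"; "/" -> ""
        String fromPath = requestPath == null ? "" : requestPath.replaceFirst("^/+", "");
        if (!fromPath.isEmpty()) {
            return fromPath;              // Priority 1: the path wins
        }
        if (headerValue != null && !headerValue.isBlank()) {
            return headerValue.trim();    // Priority 2: header, only when the path is "/"
        }
        return nowMillis + ".data";       // Priority 3: auto-generated name
    }
}
```

Passing the clock value in as `nowMillis` keeps the fallback deterministic, which makes the rule easy to unit test.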
Examples:
# Priority 1: Path wins
curl -X POST http://localhost:4000/documents/report.pdf \
-H "X-file-name: ignored.txt" \
--data-binary @report.pdf
# Stores as: documents/report.pdf
# Priority 2: Header used when path is /
curl -X POST http://localhost:4000/ \
-H "X-file-name: myfile.txt" \
--data-binary @localfile.txt
# Stores as: myfile.txt
# Priority 3: Auto-generated when neither specified
curl -X POST http://localhost:4000/ \
--data-binary @localfile.txt
# Stores as: 1731831600000.data

Response:
File stored successfully: myfile.txt
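For callers that prefer Java over curl, the same two upload styles can be built with the JDK's `java.net.http` API. This is a minimal sketch: the class `UploadRequests` and its method names are hypothetical, and it assumes the filename is already URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; not part of the service itself.
final class UploadRequests {
    private UploadRequests() {}

    /** POST to /<filename>, so the request path carries the target name. */
    static HttpRequest toPath(String baseUrl, String filename, byte[] body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + filename))
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    /** POST to / and name the file through the X-file-name header. */
    static HttpRequest withHeader(String baseUrl, String filename, byte[] body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/"))
                .header("X-file-name", filename)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }
}
```

A built request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which returns the plain-text response body shown above.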
curl http://localhost:4000/health

Response:
OK - Storage: FileSystem [files]
The application uses a layered configuration approach with the following priority order (highest to lowest):
- Command Line Arguments - Highest priority
- Environment Variables - Override YAML configuration
- application.yml - Default configuration file
- Code Defaults - Fallback values
This means you can set defaults in application.yml and override them with environment variables in production without modifying the JAR file.
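The lookup order can be pictured as a chain of maps checked from highest to lowest priority. The sketch below is not Spring Boot's binder; `LayeredConfig` is a hypothetical class, and the environment-variable naming simply mirrors the names used in this README (dots and dashes become underscores, then uppercase).

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of layered configuration lookup.
final class LayeredConfig {
    private LayeredConfig() {}

    /** "storage.minio.access-key" -> "STORAGE_MINIO_ACCESS_KEY". */
    static String envName(String property) {
        return property.replace('.', '_').replace('-', '_').toUpperCase(Locale.ROOT);
    }

    /** Returns the first value found, checking sources in priority order. */
    static String resolve(String property,
                          Map<String, String> commandLine,
                          Map<String, String> environment,
                          Map<String, String> yaml,
                          String codeDefault) {
        if (commandLine.containsKey(property)) {
            return commandLine.get(property);        // 1. command line arguments
        }
        String fromEnv = environment.get(envName(property));
        if (fromEnv != null) {
            return fromEnv;                          // 2. environment variables
        }
        return yaml.getOrDefault(property, codeDefault); // 3. application.yml, 4. code default
    }
}
```

With `server.port: 4000` in the YAML map and `SERVER_PORT=8080` in the environment map, `resolve("server.port", ...)` yields `8080` unless the command-line map also sets `server.port`.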
| Property | Type | Default | Description |
|---|---|---|---|
| server.port | Integer | 4000 | HTTP server port |

| Property | Type | Default | Description |
|---|---|---|---|
| storage.type | Enum | filesystem | Storage backend: FILESYSTEM or MINIO |

| Property | Type | Default | Description |
|---|---|---|---|
| storage.filesystem.directory | String | files | Directory path where files will be stored |

| Property | Type | Default | Required | Description |
|---|---|---|---|---|
| storage.minio.endpoint | String | - | Yes* | MinIO server URL (e.g., http://localhost:9000) |
| storage.minio.access-key | String | - | Yes* | MinIO access key for authentication |
| storage.minio.secret-key | String | - | Yes* | MinIO secret key for authentication |
| storage.minio.bucket-name | String | data-lake | No | Bucket name where files will be stored |
*Required when storage.type=minio
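The "required when" rule can be expressed as a small startup check. The following is a sketch under assumptions: `MinioSettingsCheck` is a hypothetical class that works on a flat property map, and the application's real validation (if any) may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical validation helper for the MinIO properties listed above.
final class MinioSettingsCheck {
    private MinioSettingsCheck() {}

    private static final List<String> REQUIRED = List.of(
            "storage.minio.endpoint",
            "storage.minio.access-key",
            "storage.minio.secret-key");

    /** Returns the required MinIO keys that are absent or blank. */
    static List<String> missing(Map<String, String> props) {
        List<String> missing = new ArrayList<>();
        String type = props.getOrDefault("storage.type", "filesystem");
        if (!"minio".equalsIgnoreCase(type)) {
            return missing; // filesystem backend: MinIO keys are not needed
        }
        for (String key : REQUIRED) {
            String value = props.get(key);
            if (value == null || value.isBlank()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

Note that storage.minio.bucket-name is not checked because it falls back to data-lake.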
The main configuration file is src/main/resources/application.yml:
server:
  port: 4000
spring:
  application:
    name: data-lake-receiver
storage:
  type: filesystem # or minio
  filesystem:
    directory: files
  minio:
    endpoint: http://localhost:9000
    access-key: minioadmin
    secret-key: minioadmin
    bucket-name: data-lake

The default profile uses filesystem storage with default settings.
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

The production profile reduces logging verbosity for production environments.
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --spring.profiles.active=production

The minio profile automatically switches to the MinIO storage backend.
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --spring.profiles.active=minio

Edit src/main/resources/application.yml:
server:
  port: 4000
storage:
  type: filesystem
  filesystem:
    directory: ./data/files

Run:
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

export SERVER_PORT=8080
export STORAGE_TYPE=MINIO
export STORAGE_MINIO_ENDPOINT=http://localhost:9000
export STORAGE_MINIO_ACCESS_KEY=minioadmin
export STORAGE_MINIO_SECRET_KEY=minioadmin
export STORAGE_MINIO_BUCKET_NAME=production-data
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

application.yml (development defaults):
server:
  port: 4000
storage:
  type: filesystem
  filesystem:
    directory: ./dev-files

Production (override with environment variables):
export STORAGE_TYPE=MINIO
export STORAGE_MINIO_ENDPOINT=https://minio.prod.com
export STORAGE_MINIO_ACCESS_KEY=${PROD_ACCESS_KEY}
export STORAGE_MINIO_SECRET_KEY=${PROD_SECRET_KEY}
java -jar data-lake-receiver-1.0-SNAPSHOT.jar

Stores files in a directory on the local filesystem.
- Simple - No external dependencies
- Fast - Low latency for local operations
- Easy Setup - Just specify a directory path
storage:
  type: filesystem
  filesystem:
    directory: /data/uploads

Or via environment:
export STORAGE_TYPE=FILESYSTEM
export STORAGE_FILESYSTEM_DIRECTORY=/data/uploads

docker run -d \
  -p 4000:4000 \
  -e STORAGE_TYPE=FILESYSTEM \
  -e STORAGE_FILESYSTEM_DIRECTORY=/data/files \
  -v /host/path:/data/files \
  data-lake-receiver:latest

Pros:
- Low latency
- Simple deployment
- No external dependencies
Cons:
- Limited scalability
- Single point of failure
- Not suitable for distributed systems
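To make the filesystem behaviour concrete, here is a minimal sketch of how a name such as documents/report.pdf could be written under the configured directory. It is not the project's FileSystemStorageProvider: the class `FileSystemStoreSketch` is hypothetical, and the rejection of names that escape the base directory is an assumption added for safety, not documented behaviour.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of storing an upload under a base directory.
final class FileSystemStoreSketch {
    private FileSystemStoreSketch() {}

    static Path store(Path baseDir, String filename, byte[] data) throws IOException {
        Path base = baseDir.toAbsolutePath().normalize();
        Path target = base.resolve(filename).normalize();
        // Assumption: names that resolve outside the base directory are refused.
        if (!target.startsWith(base)) {
            throw new IOException("Filename escapes storage directory: " + filename);
        }
        // Create intermediate directories such as "documents/" on demand.
        Files.createDirectories(target.getParent());
        return Files.write(target, data); // creates or overwrites the file
    }
}
```

Calling `store(Path.of("files"), "logs/2024/app.log", bytes)` would create files/logs/2024/ if needed and write app.log inside it.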
Stores files to a MinIO object storage bucket (S3-compatible).
- Scalable - Distributed object storage
- High Availability - Built-in redundancy
- S3-Compatible - Works with any S3-compatible storage
- Auto-Bucket Creation - Creates bucket if it doesn't exist
storage:
  type: minio
  minio:
    endpoint: http://localhost:9000
    access-key: minioadmin
    secret-key: minioadmin
    bucket-name: data-lake

Or via environment:
export STORAGE_TYPE=MINIO
export STORAGE_MINIO_ENDPOINT=http://localhost:9000
export STORAGE_MINIO_ACCESS_KEY=minioadmin
export STORAGE_MINIO_SECRET_KEY=minioadmin
export STORAGE_MINIO_BUCKET_NAME=data-lake

version: '3.8'
services:
  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=minioadmin
      - MINIO_ROOT_PASSWORD=minioadmin
    command: server /data --console-address ":9001"
    volumes:
      - minio-data:/data
  data-lake-receiver:
    image: data-lake-receiver:latest
    ports:
      - "4000:4000"
    environment:
      - STORAGE_TYPE=MINIO
      - STORAGE_MINIO_ENDPOINT=http://minio:9000
      - STORAGE_MINIO_ACCESS_KEY=minioadmin
      - STORAGE_MINIO_SECRET_KEY=minioadmin
      - STORAGE_MINIO_BUCKET_NAME=data-lake
    depends_on:
      - minio
volumes:
  minio-data:

Run with:
docker-compose up -d

Pros:
- Scalable and distributed
- S3-compatible
- High availability
- Suitable for production/cloud deployments
Cons:
- Network overhead
- Requires additional infrastructure
- More complex setup
- Use strong access keys and secret keys
- Enable HTTPS/TLS for production deployments
- Implement bucket policies and access controls
- Rotate credentials regularly
- Use network policies to restrict MinIO access
The application displays a custom ASCII art banner on startup with dynamic information:
____ _ _ _ ____ _
| _ \ __ _| |_ __ _ | | __ _ | | _____ | _ \ ___ ___ ___(_)_ _____ _ __
| | | |/ _` | __/ _` | | | / _` || |/ / _ \ | |_) / _ \/ __/ _ \ \ \ / / _ \ '__|
| |_| | (_| | || (_| | | |__| (_| || < __/ | _ < __/ (_| __/ |\ V / __/ |
|____/ \__,_|\__\__,_| |_____\__,_||_|\_\___| |_| \_\___|\___\___|_| \_/ \___|_|
:: Spring Boot 3.2.0 ::
:: Application: Data Lake Receiver ::
:: Version: 1.0-SNAPSHOT ::
:: Storage: FILESYSTEM ::
===================================================================================
Ready to receive and store files - Send POST requests to http://localhost:4000/
===================================================================================
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --spring.main.banner-mode=off

Or in application.yml (quote the value, because YAML otherwise reads off as the boolean false):
spring:
  main:
    banner-mode: "off"

Edit src/main/resources/banner.txt:
${AnsiColor.BRIGHT_CYAN}
____ _ _ _ ____ _
| _ \ __ _| |_ __ _ | | __ _ | | _____ | _ \ ___ ___ ___(_)_ _____ _ __
${AnsiColor.DEFAULT}
${AnsiColor.BRIGHT_GREEN} :: Spring Boot ${spring-boot.version} ::${AnsiColor.DEFAULT}
| Placeholder | Description | Example |
|---|---|---|
| ${spring-boot.version} | Spring Boot version | 3.2.0 |
| ${application.title} | Application title | Data Lake Receiver |
| ${application.version} | Application version | 1.0-SNAPSHOT |
| ${server.port} | Server port | 4000 |
| ${storage.type} | Storage backend type | FILESYSTEM |
- Java 17 or higher
- Maven 3.6 or higher
mvn clean package

The executable JAR will be created at target/data-lake-receiver-1.0-SNAPSHOT.jar.

mvn test

Pre-built images are automatically published to GitHub Container Registry on every release:
# Pull and run latest version
docker pull ghcr.io/sparkworks/data-lake-receiver:latest
docker run -p 4000:4000 ghcr.io/sparkworks/data-lake-receiver:latest
# Or use in docker-compose

Update docker-compose.yml to use the pre-built image:
services:
  data-lake-receiver:
    image: ghcr.io/sparkworks/data-lake-receiver:latest # Use pre-built image
    ports:
      - "4000:4000"

Available images:
- ghcr.io/sparkworks/datalake-receiver:latest - Latest stable release
- ghcr.io/sparkworks/datalake-receiver:1.0.0 - Specific version
- ghcr.io/sparkworks/datalake-receiver:develop - Development branch
See GITHUB_REGISTRY.md for complete details on pulling images and authentication.
If you need to build the image locally:
# Build using traditional Dockerfile
docker build -t data-lake-receiver:latest .
# Or using Spring Boot Maven plugin
mvn spring-boot:build-image

# Start the service
docker-compose -f docker-compose-filesystem.yml up -d

What you get:
- Data Lake Receiver on port 4000
- Files stored in ./data/files/

# Start services (MinIO + Data Lake Receiver)
docker-compose -f docker-compose-minio.yml up -d

What you get:
- MinIO server on ports 9000 (API) and 9001 (Console)
- Data Lake Receiver on port 4000
- Files stored in MinIO bucket
- MinIO Console at http://localhost:9001 (login: minioadmin/minioadmin)
# Build the image first
mvn spring-boot:build-image
# Start the service
docker-compose -f docker-compose-filesystem.yml up -d
# View logs
docker-compose -f docker-compose-filesystem.yml logs -f
# Stop and remove
docker-compose -f docker-compose-filesystem.yml down
# Upload a file
curl -X POST http://localhost:4000/ \
-H "X-file-name: test.txt" \
--data-binary @myfile.txt

See DOCKER.md for:
- Complete setup instructions
- Configuration options
- Troubleshooting guide
- Production deployment tips
- Networking architecture
src/main/java/net/sparkworks/ac3/logger/
├── DataLakeReceiverApplication.java # Spring Boot main class
├── config/
│ ├── StorageConfiguration.java # Bean configuration
│ └── StorageProperties.java # Configuration properties
├── controller/
│ └── FileReceiverController.java # REST controller
└── storage/
    ├── StorageProvider.java # Interface
    ├── FileSystemStorageProvider.java # Filesystem implementation
    └── MinIOStorageProvider.java # MinIO implementation
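The storage package is built around the StorageProvider interface. As a rough picture of that contract, the sketch below declares a local copy of the interface, assuming it has the three methods used in this README's S3 example (initialize, store, getName), and adds a hypothetical in-memory implementation that could be handy in unit tests. Neither type is taken from the project's source.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Assumed shape of the project's interface, redeclared so the sketch compiles alone.
interface StorageProvider {
    void initialize() throws IOException;
    void store(String filename, byte[] data) throws IOException;
    String getName();
}

// Hypothetical provider that keeps uploads in memory instead of on disk.
final class InMemoryStorageProvider implements StorageProvider {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    @Override
    public void initialize() {
        // Nothing to set up for an in-memory store.
    }

    @Override
    public void store(String filename, byte[] data) {
        files.put(filename, data.clone()); // defensive copy of the payload
    }

    @Override
    public String getName() {
        return "InMemory";
    }

    /** Test helper: returns the stored bytes, or null if the name is unknown. */
    byte[] read(String filename) {
        return files.get(filename);
    }
}
```

Because the controller only depends on the interface, swapping in a provider like this requires no change to the upload path.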
- Create a class implementing StorageProvider

import java.io.IOException;

public class S3StorageProvider implements StorageProvider {
    @Override
    public void initialize() throws IOException {
        // Initialize S3 client
    }

    @Override
    public void store(String filename, byte[] data) throws IOException {
        // Upload to S3
    }

    @Override
    public String getName() {
        return "AWS S3";
    }
}

- Add configuration to StorageProperties

@Data
public static class S3Config {
    private String bucket;
    private String region;
    private String accessKey;
    private String secretKey;
}

- Update StorageConfiguration to create the bean

case S3 -> createS3Provider(properties.getS3());

- Add the new storage type to the enum

public enum StorageType {
    FILESYSTEM,
    MINIO,
    S3
}

Logging is configured in application.yml:
logging:
  level:
    root: INFO
    net.sparkworks.ac3.logger: DEBUG
  pattern:
    console: "%d{yyyy-MM-dd HH:mm:ss} - %msg%n"

Via environment variables:
export LOGGING_LEVEL_ROOT=WARN
export LOGGING_LEVEL_NET_SPARKWORKS_AC3_LOGGER=DEBUG

Via command line:
java -jar data-lake-receiver-1.0-SNAPSHOT.jar \
  --logging.level.root=WARN \
  --logging.level.net.sparkworks.ac3.logger=DEBUG

The application logs:
- Configuration on startup
- Storage provider initialization
- File upload requests with headers
- Storage operations (success/failure)
- Any errors with stack traces
Example output:
2025-11-16 15:45:23 - Starting DataLakeReceiverApplication...
2025-11-16 15:45:24 - Creating storage provider of type: FILESYSTEM
2025-11-16 15:45:24 - FileSystem storage initialized at: files
2025-11-16 15:45:25 - Started DataLakeReceiverApplication in 2.543 seconds
2025-11-16 15:45:30 - header[X-file-name]=[test.txt]
2025-11-16 15:45:30 - Storing file with filename: test.txt (1024 bytes)
2025-11-16 15:45:30 - Successfully stored file: test.txt (1024 bytes)
curl http://localhost:4000/health

The response shows the storage backend status:
OK - Storage: FileSystem [files]
Or for MinIO:
OK - Storage: MinIO [bucket: data-lake]
You can enable additional actuator endpoints by adding the following dependency to pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Then expose the endpoints in application.yml:
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics

Check logs for configuration errors:
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --debug

Common issues:
- Invalid MinIO credentials
- Missing required configuration
- Port already in use
Filesystem Storage:
- Check directory permissions
- Verify directory path exists or can be created
- Check available disk space
MinIO Storage:
- Verify MinIO server is running and accessible
- Check endpoint URL is correct
- Verify access key and secret key are correct
- Check network connectivity
Change the port:
java -jar data-lake-receiver-1.0-SNAPSHOT.jar --server.port=8080

Or set the environment variable:
export SERVER_PORT=8080

Check environment variables:
printenv | grep -E '(SERVER_PORT|STORAGE)'

Verify YAML syntax:
- Use spaces, not tabs
- Check indentation
- Ensure property names match exactly
Enable debug logging:
java -jar data-lake-receiver-1.0-SNAPSHOT.jar \
  --logging.level.org.springframework.boot=DEBUG

Test MinIO connectivity:
curl http://localhost:9000/minio/health/live

Check MinIO logs:
docker logs minio-container-name

Verify credentials:
# Try accessing MinIO console
open http://localhost:9001

The Spring Boot version maintains API compatibility with the previous version.
✅ Same Endpoints:
- POST / - Upload files
- Same X-file-name header
- Same response behavior
Old (environment variables):
export HTTP_SERVER_PORT=4000
export STORAGE_TYPE=filesystem
export FILES_DIR=files

New (recommended):
export SERVER_PORT=4000
export STORAGE_TYPE=FILESYSTEM
export STORAGE_FILESYSTEM_DIRECTORY=files

Old (application.properties):
HTTP_SERVER_PORT=4000
STORAGE_TYPE=filesystem
FILES_DIR=files

New (application.yml):
server:
  port: 4000
storage:
  type: filesystem
  filesystem:
    directory: files

| Feature | Old Version | New Version |
|---|---|---|
| Web Server | com.sun.net.httpserver | Spring Boot / Embedded Tomcat |
| Configuration | Custom loader | Spring Boot Configuration |
| DI | Manual | Spring Framework |
| API | HTTP Handler | Spring REST Controller |
| Monitoring | None | Spring Actuator |
| Logging | Logback | Logback + Spring Boot |
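The environment-variable renames in this guide are mechanical, so an existing deployment's settings can be translated with a few lines of code. The helper below is purely illustrative: `EnvMigration` is a hypothetical class, and it covers only the three variables listed in this guide.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical translator from the old variable names to the new ones.
final class EnvMigration {
    private EnvMigration() {}

    static Map<String, String> migrate(Map<String, String> oldEnv) {
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : oldEnv.entrySet()) {
            String value = entry.getValue();
            switch (entry.getKey()) {
                case "HTTP_SERVER_PORT" -> migrated.put("SERVER_PORT", value);
                case "FILES_DIR" -> migrated.put("STORAGE_FILESYSTEM_DIRECTORY", value);
                // Same name, but the new examples use the upper-case enum value.
                case "STORAGE_TYPE" -> migrated.put("STORAGE_TYPE", value.toUpperCase(Locale.ROOT));
                default -> migrated.put(entry.getKey(), value); // unrelated variables pass through
            }
        }
        return migrated;
    }
}
```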
Apache License 2.0
For issues, questions, or contributions, please refer to the project repository.