I’m a Data Engineer with experience building data pipelines, streaming systems, and ML-driven analytics using Python and AWS.
I come from a traditional engineering background and transitioned into data engineering by building real systems end to end — not just notebooks or tutorials. I focus on making data usable, scalable, and reliable, especially in environments where resources, time, or cloud budgets are limited.
I design and implement systems that:
- Ingest and process data reliably
- Support analytics and machine learning use cases
- Follow cloud-native and event-driven architecture principles
- Are understandable and maintainable by teams
My work sits at the intersection of data engineering, backend systems, and applied machine learning.
Real-time, event-driven architecture
- Built a streaming data platform using Kafka
- Event-driven ingestion and processing
- Event routing concepts inspired by AWS Route 53 routing policies
- Observability with Grafana
- Focus on data flow, reliability, and monitoring
Tech: Python, Kafka, Event-driven architecture, Grafana
→ Repository: ecommerce-streaming-data-platform
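To give a flavor of the ingestion side, here is a minimal Python sketch of the pattern (the topic name, broker address, and event schema are illustrative, not the repository's actual ones):

```python
import json
from kafka import KafkaProducer

# Illustrative broker and serialization settings; the real platform's differ.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    acks="all",  # wait for full acknowledgement, favoring reliability
)

order_event = {
    "event_type": "order_created",
    "order_id": "o-1001",
    "customer_id": "c-42",
    "total": 59.90,
}

# Key by customer so events for one customer stay ordered within a partition.
producer.send("orders", key=order_event["customer_id"], value=order_event)
producer.flush()
```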
Batch ingestion and analytics
- End-to-end data ingestion and processing pipeline
- Structured data lake layout
- Designed for analytics and reporting use cases
- Emphasis on automation and data quality checks
Tech: Python, Data pipelines
→ Repository: datalake-analytics-pipeline
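A minimal sketch of the core pattern, validating a batch and then landing it in a Hive-style partitioned Parquet layout (paths, columns, and checks are placeholders; assumes pandas with pyarrow installed):

```python
import pandas as pd

def quality_check(df: pd.DataFrame) -> pd.DataFrame:
    """Basic data-quality gates before anything lands in the lake."""
    assert df["order_id"].is_unique, "duplicate order_id in batch"
    assert df["amount"].ge(0).all(), "negative amounts in batch"
    return df.dropna(subset=["order_id", "order_date"])

raw = pd.read_csv("raw/orders_2024-06-01.csv", parse_dates=["order_date"])
clean = quality_check(raw)
clean["year"] = clean["order_date"].dt.year
clean["month"] = clean["order_date"].dt.month

# Hive-style layout: lake/orders/year=2024/month=6/...
clean.to_parquet("lake/orders", partition_cols=["year", "month"], engine="pyarrow")
```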
Machine learning on AWS
- Customer churn prediction workflow
- Training and evaluation using AWS SageMaker
- End-to-end ML lifecycle: data prep → training → evaluation
- Focus on deployable, reproducible ML workflows
Tech: Python, AWS SageMaker, Machine Learning
→ Repository: customer-churn-prediction
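The shape of the training step, sketched with the SageMaker Python SDK (the IAM role, S3 paths, and entry-point script are placeholders, not the repository's actual names):

```python
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# train.py would hold the actual scikit-learn training and evaluation code.
estimator = SKLearn(
    entry_point="train.py",
    role=role,
    instance_type="ml.m5.large",
    instance_count=1,
    framework_version="1.2-1",
    py_version="py3",
    sagemaker_session=session,
)

estimator.fit({
    "train": "s3://my-bucket/churn/train/",
    "validation": "s3://my-bucket/churn/validation/",
})
```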
Predictive analytics for inventory management
- Time-series forecasting for demand prediction
- Feature engineering and model training pipeline
- Designed to support business decision-making
Tech: Python, Forecasting models
→ Repository: demand-forecasting-system
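A compact sketch of the lag-feature approach this kind of pipeline relies on (column names, lags, and the holdout window are illustrative):

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Daily demand; real data would come from the lake or a warehouse.
df = pd.read_csv("demand_daily.csv", parse_dates=["date"]).sort_values("date")

# Feature engineering: lagged demand and a rolling mean as predictors.
for lag in (1, 7, 28):
    df[f"lag_{lag}"] = df["units_sold"].shift(lag)
df["rolling_7"] = df["units_sold"].shift(1).rolling(7).mean()
df = df.dropna()

features = ["lag_1", "lag_7", "lag_28", "rolling_7"]
train, test = df.iloc[:-28], df.iloc[-28:]  # hold out the last four weeks

model = GradientBoostingRegressor().fit(train[features], train["units_sold"])
forecast = model.predict(test[features])
```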
Distributed systems & event-driven backend (AWS-style simulation)
- Microservices-based backend system for a radio station
- Event-driven communication using Kafka
- Service coordination with ZooKeeper
- Built with Java, Spring Boot, and Spring Cloud
- Architecture designed to simulate AWS-managed services locally for learning and cost efficiency
Focus: Distributed systems design, messaging, service discovery
Tech: Java, Spring Boot, Spring Cloud, Kafka, ZooKeeper
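The project itself is Java/Spring, but the coordination pattern at its core fits in a few lines of Python with kazoo: each service instance registers an ephemeral znode that disappears if the instance dies, and clients discover live instances by listing children (paths and addresses are illustrative):

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Register this instance: the ephemeral node vanishes if the service dies,
# so the registry only ever lists live instances.
zk.ensure_path("/services/stream-service")
zk.create(
    "/services/stream-service/instance-",
    b"10.0.0.5:8080",
    ephemeral=True,
    sequence=True,
)

# Discovery: list currently registered instances.
instances = zk.get_children("/services/stream-service")
print(instances)

zk.stop()
```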
Data engineering:
- Python, SQL
- Kafka
- Batch & streaming pipelines
- Data modeling and data flow design

Cloud & infrastructure:
- AWS (S3, Lambda, DynamoDB, SageMaker, API Gateway)
- Infrastructure as Code: Terraform (hands-on learning and application)
- IAM & least-privilege design

Machine learning:
- scikit-learn
- Time-series forecasting
- ML pipelines and experimentation
- SageMaker workflows

Backend & distributed systems:
- Java
- Spring Boot, Spring Cloud
- Microservices architecture
- Event-driven systems

Currently focusing on:
- Designing AWS-native architectures
- Infrastructure as Code with Terraform
- Improving observability and system design
- Preparing for Data Engineer / Data Platform Engineer roles
- 💼 LinkedIn: https://www.linkedin.com/in/rociobaigorria/
- 📧 Email: rociomnbaigorria@gmail.com
- 🌍 Location: Argentina (GMT-3) — open to remote opportunities
“Making data accessible to people who actually need to use it.”