New pages: - Home: Redesigned with hero, impact stats, featured project - About: 6-section professional narrative - Projects: Hub with 4 project cards and status badges - Resume: Inline display with download placeholders - Contact: Form UI (disabled) with contact info - Blog: Markdown-based system with frontmatter support Infrastructure: - Blog system with markdown loader (python-frontmatter, markdown, pygments) - Sidebar callback for active state highlighting on navigation - Separated navigation into main pages and projects/dashboards groups Closes #36, #37, #38, #39, #40, #41, #42, #43 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
3.8 KiB
title, date, description, tags, status
| title | date | description | tags | status | |||
|---|---|---|---|---|---|---|---|
| Building a Data Platform as a Team of One | 2025-01-15 | What I learned from 5 years as the sole data professional at a mid-size company |
|
published |
When I joined Summitt Energy in 2019, there was no data infrastructure. No warehouse. No pipelines. No documentation. Just a collection of spreadsheets and a Genesys Cloud instance spitting out CSVs.
Five years later, I'd built DataFlow: an enterprise platform processing 1B+ rows across 21 tables, feeding dashboards that executives actually opened. Here's what I learned doing it alone.
The Reality of "Full Stack Data"
When you're the only data person, "full stack" isn't a buzzword—it's survival. In a single week, I might:
- Debug a Python ETL script at 7am because overnight loads failed
- Present quarterly metrics to leadership at 10am
- Design a new dimensional model over lunch
- Write SQL transformations in the afternoon
- Handle ad-hoc "can you pull this data?" requests between meetings
There's no handoff. No "that's not my job." Everything is your job.
Prioritization Frameworks
The hardest part isn't the technical work—it's deciding what to build first when everything feels urgent.
The 80/20 Rule, Applied Ruthlessly
I asked myself: What 20% of the data drives 80% of decisions?
For a contact center, that turned out to be:
- Call volume by interval
- Abandon rate
- Average handle time
- Service level
Everything else was nice-to-have. I built those four metrics first, got them bulletproof, then expanded.
The "Who's Screaming?" Test
When multiple stakeholders want different things:
- Who has executive backing?
- What's blocking revenue?
- What's causing visible pain?
If nobody's screaming, it can probably wait.
Technical Debt vs. Shipping
I rewrote DataFlow three times:
- v1 (2020): Hacky Python scripts. Worked, barely.
- v2 (2021): Proper dimensional model. Still messy code.
- v3 (2022): SQLAlchemy ORM, proper error handling, logging.
- v4 (2023): dbt-style transformations, FastAPI layer.
Was v1 embarrassing? Yes. Did it work? Also yes.
The lesson: Ship something that works, then iterate. Perfect is the enemy of done, especially when you're alone.
Building Stakeholder Trust
The technical work is maybe 40% of the job. The rest is politics.
Quick Wins First
Before asking for resources or patience, I delivered:
- Automated a weekly report that took someone 4 hours
- Fixed a dashboard that had been wrong for months
- Built a simple tool that answered a frequent question
Trust is earned in small deposits.
Speak Their Language
Executives don't care about your star schema. They care about:
- "This will save 10 hours/week"
- "This will catch errors before they hit customers"
- "This will let you see X in real-time"
Translate technical work into business outcomes.
What I'd Do Differently
-
Document earlier. I waited too long. When I finally wrote things down, onboarding became possible.
-
Say no more. Every "yes" to an ad-hoc request is a "no" to infrastructure work. Guard your time.
-
Build monitoring first. I spent too many mornings discovering failures manually. Alerting should be table stakes.
-
Version control everything. Even SQL. Even documentation. If it's not in Git, it doesn't exist.
The Upside
Being a team of one forced me to learn things I'd have specialized away from on a bigger team:
- Data modeling
- Pipeline architecture
- Dashboard design
- Stakeholder management
- System administration
It's brutal, but it makes you dangerous. You understand the whole stack.
This is part of a series on building data infrastructure at small companies. More posts coming on dimensional modeling, dbt patterns, and surviving legacy systems.