I Accidentally Built a Modern Data Platform Years Before It Became a Trend

“કેડ માં છોકરું, ને ગામ માં ઢંઢેરો”

Basically: built something quietly, realised later it was a big deal.

TL;DR — Years ago, I built a simple pricing system: raw tables in, a stored procedure in the middle, and one final “correct price” table out. At the time, I didn’t know anything about terms like data products, Foundational Data Products (FDPs), a Derived Data Product (DDP), data lineage, or data contracts. I was just trying to make the website show the right price.

Much later, when I started learning modern data engineering, I realised I had already built many of those concepts without knowing their names. This post is me looking back, connecting the dots, and understanding those ideas through something I actually built in real life.


Why I’m Writing This

I’m writing this mainly for myself. I’ve been trying to understand modern data engineering concepts, and sometimes the wording makes everything sound more complex than it is. When I step back and compare the concepts to my real experience, I realise I actually lived through many of them already — I just didn’t have the terminology.

This blog helps me learn by linking theory to something real I built. If someone else learns from it too, great.


The System I Built (Before I Had the Words)

Years ago, I built a pricing system for an e‑commerce site. The business side (marketing/admin) would enter:

  • Product‑level prices
  • Variant‑level prices
  • Start and end dates

I stored that data in two simple raw tables. Then I wrote a stored procedure that:

  • cleaned and normalised the input,
  • applied pricing rules,
  • merged everything into one consistent truth,
  • fixed date issues,
  • and finally wrote out one table that the frontend could trust.

The rules were practical and simple:

  • If a variant had its own price, use it.
  • Otherwise, use the product price.
  • Respect time ranges.
  • Don’t leave gaps or overlaps.
  • Don’t produce duplicates.

I didn’t think about “architecture.” I wasn’t trying to implement patterns. I just wanted a clean and reliable answer for the frontend.

Back then, I called it:

“the final price table.”

Nothing more.


The Same System, With Modern Labels

Later, when I started exploring modern data engineering — especially the vocabulary — things clicked.

My old setup actually lined up with many concepts people talk about today.

Here’s the mapping using generic table names:

  • Raw inputs: price_raw_product, price_raw_variant
  • Foundational Data Products (FDPs): cleaned versions of the raw inputs (price_fdp_product, price_fdp_variant)
  • Derived Data Product (DDP): the final truth table (price_ddp_effective)
  • Orchestration: my stored procedure
  • Consumers: frontend, checkout, reports, etc.
Admin / Marketing (price inputs)
           |
           v
Raw Input Tables
┌───────────────────────┐
│ price_raw_product     │
│ price_raw_variant     │
└──────────┬────────────┘
           v
Foundational Data Products (FDP)
┌───────────────────────────────┐
│ price_fdp_product             │
│ price_fdp_variant             │
└──────────┬────────────────────┘
           v
Orchestration (Stored Procedure)
┌───────────────────────────────┐
│ joins, precedence, date logic │
└──────────┬────────────────────┘
           v
Derived Data Product (DDP)
┌───────────────────────────────┐
│ price_ddp_effective           │
└──────────┬────────────────────┘
           v
Frontend / APIs / Checkout

Mapping My Experience to Data Engineering Concepts

This is the part that surprised me the most. Reading about “data products,” “data mesh,” “lineage,” and “contracts” suddenly felt familiar. I had unintentionally built small versions of these ideas.

1) Data Product

A data product is basically a dataset with a clear purpose, rules, and an owner.

That final table I produced fulfilled exactly that purpose:

“Give me the correct price for any product or variant, at any time.”

It had a consistent schema, the whole system depended on it, and I was the owner. So yes — it was a data product even if I never used the term.


2) Foundational vs. Derived Data Products

  • My cleaned tables (price_fdp_product, price_fdp_variant) were FDPs — stable inputs.
  • My final table (price_ddp_effective) was the DDP — the business‑logic‑applied truth.

I didn’t design it like that intentionally. But the structure naturally appeared because it was practical.


3) Orchestration

Today people use workflow tools, but my orchestration was simply:

  • A stored procedure,
  • Triggered by a cron job,
  • Running steps in a controlled order.

It cleaned, merged, validated, and published the final result.

Simple. And dependable.


4) Data Lineage

I didn't draw lineage diagrams back then, but the flow was clear.

price_raw_* → price_fdp_* → price_ddp_effective → frontend

When something was wrong, I always knew where to look:

  • If it was a business logic issue → the DDP.
  • If it was messy input → the raw tables.
  • If it was formatting → the FDP layer.

That’s lineage in practice.


5) Data Contract

No one told me to write a “data contract.” But I still had an internal understanding:

  • the columns needed to exist,
  • dates had to be valid,
  • variant overrides product,
  • the table needed to refresh with cron.

It wasn’t formal, but the “contract” existed through consistent behaviour.


The Core Business Logic (Short & Simple)

Here’s the simplified idea of what the stored procedure did, without the exact code.

-- Normalize raw inputs into FDPs
CREATE TABLE price_fdp_product AS
SELECT product_id,
       price,
       GREATEST(start_date, DATE '2000-01-01') AS start_date,
       COALESCE(end_date, DATE '2999-12-31')   AS end_date
FROM price_raw_product
WHERE price IS NOT NULL;

CREATE TABLE price_fdp_variant AS
SELECT product_id, variant_id,
       price,
       GREATEST(start_date, DATE '2000-01-01') AS start_date,
       COALESCE(end_date, DATE '2999-12-31')   AS end_date
FROM price_raw_variant
WHERE price IS NOT NULL;

-- Materialize the DDP: variant price overrides product price for the same period
CREATE TABLE price_ddp_effective AS
WITH unioned AS (
  SELECT product_id, variant_id, price, start_date, end_date, 2 AS precedence
  FROM price_fdp_variant
  UNION ALL
  SELECT product_id, NULL AS variant_id, price, start_date, end_date, 1 AS precedence
  FROM price_fdp_product
),
resolved AS (
  SELECT *
  FROM unioned
  QUALIFY ROW_NUMBER() OVER (
    PARTITION BY product_id, COALESCE(variant_id, -1), start_date, end_date
    ORDER BY precedence DESC
  ) = 1
)
SELECT product_id, variant_id,
       price AS effective_price,
       start_date, end_date,
       CASE WHEN variant_id IS NOT NULL THEN 'variant' ELSE 'product' END AS price_source
FROM resolved;

Not fancy.

Just practical.

In reality, I also handled edge cases (date overlaps, gaps, rounding rules, and invalid inputs).


Scheduling & Freshness

The entire pipeline refreshed through a cron job triggered via command line function and some database triggers. This could have been improved but when simple things solves the purpose, no need make it complicated with modern flows.

No special tools.

No workflow orchestrators.

And honestly, it worked extremely well.


Why This Whole Thing Worked

Looking back, the success wasn’t about technology. It was about clarity:

  • Inputs were clean.
  • Logic lived in one place.
  • Output was predictable.
  • Everyone trusted the result.

That trust mattered more than the fancy names.


How I Would Describe It Today

If I had to describe the same system in “modern terms,” I’d say:

“This is a pricing data product built from foundational inputs, transformed through an orchestration step into a reliable derived output used across the platform.”

🤯

But honestly?

The simpler explanation still wins:

“It’s one table that always had the right price.”

Both are true. Just different levels of vocabulary.


What I Learned From This

Some personal lessons that stick with me:

  1. You can build something modern without knowing the terminology.
  2. Words can make things sound complex — the concepts are often simple.
  3. If systems are predictable and trusted, you’re doing it right.
  4. Data engineering becomes clearer when tied to real experiences.
  5. Practical work teaches faster than reading definitions.
  6. Sometimes you understand what you built only years later.

If You’re Building Something Similar

My advice is simple:

  • Clean your inputs.
  • Name your outputs clearly.
  • Keep the logic centralised.
  • Don’t overcomplicate orchestration.
  • Make your data trustworthy.
  • Let the fancy terms come after the architecture makes sense.

Understanding follows experience — not the other way around.


Thanks for reading. Writing this helped me understand these concepts in my own language.

P.S. The idea, the story, and the experience are 100% mine — the only thing AI helped with was turning my brain dump into readable English. So if anything sounds unusually polished, blame the robot, not me. 😄