Lesson 03 / 07

Data and databases, explained

What a database is, how SQL and NoSQL differ, what a good data model looks like, and why your data is the biggest asset you own.

Ask anyone who’s ever had to live with an old business system what turned out to be the most valuable thing about the whole project. They won’t say the buttons or the screens. They’ll say the data. Code can be rewritten, design redrawn, a server swapped out in an hour. But the list of your customers, orders, and invoices that took five years to fill up? That doesn’t come back a second time. This lesson is about understanding your data well enough that you never lose that asset.

What a database is and why everything else rests on it

A database is organized storage for data. No magic, just a very reliable and very fast filing cabinet that can find the right page among millions of others in a fraction of a second and, as long as it is properly backed up, doesn’t lose it.

When you picture your software, most of what you see is just different views of the same data. A customer’s profile, their orders, an invoice, a number in a report, all of it reads from the same database. The app, the buttons, the forms, that’s the plumbing around it. The water flowing through those pipes is your data.

That’s why it pays to think about the data before you think about how it looks. With cleanly stored data you can build practically anything on top of it. With messy data, the prettiest screen in the world won’t save you.

SQL versus NoSQL, without the dogma

Databases roughly split into two camps, and there are more holy wars over that choice than over almost anything else. Let’s get it out of the way without the drama.

  • Relational (SQL). Data in tables, like in a spreadsheet, only far stricter. A customers table, an orders table, and clearly defined links between them. Once those links are declared, the database itself makes sure an order can’t exist without a customer. For the vast majority of business applications this is the right and safe choice: PostgreSQL or MySQL won’t let you down.
  • NoSQL. A looser world, most often of documents, where every record can look different. It fits specific cases where the data has no fixed shape or where truly massive amounts of it flow through, like logs, sensor signals, or search indexes.

We’ll say it bluntly: people often pick NoSQL because it sounds modern, then learn the expensive way that they’re missing things a relational database does for free, above all keeping the connections between data honest. When in doubt, start with a relational database. It’s the boring, proven, correct answer nine times out of ten.
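To make “keeping the connections between data honest” concrete, here is a minimal sketch using SQLite, which ships with Python, as a stand-in for PostgreSQL or MySQL. The table and column names are assumptions made up for the illustration. It tries to save an order for a customer who doesn’t exist and shows the database refusing it.

```python
import sqlite3


def try_orphan_order() -> str:
    """Try to insert an order for a customer that does not exist."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total_cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
    # An order for an existing customer is accepted.
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 4990)")
    try:
        # Customer 999 does not exist, so the database rejects this row.
        conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (999, 100)")
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
    finally:
        conn.close()
    return "accepted"


if __name__ == "__main__":
    print(try_orphan_order())
```

In a typical document store, nothing stops that second insert unless the application code checks for it every single time. That check is the “for free” part a relational database gives you.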

Never choose a database based on what’s in fashion or what “everyone is using right now.” Choose based on the shape of your data. Trends fade; your data sticks around for years.

A good data model is the foundation of the whole build

A data model is simply a list of the things your software remembers, and the relationships between them. In the world of data, those things are called entities.

Take an online shop. You have customers, orders, products. One customer has many orders. One order contains many products. One product shows up in many orders. Sort those relationships out at the start, on paper over coffee, and you save yourself a world of pain later.
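Written down as tables, the shop above might look like the sketch below. It is one possible layout, not the only one: the names (customers, orders, products, order_items) and the sample rows are assumptions for the illustration, and SQLite stands in for whatever database you end up using. The “one order contains many products, one product shows up in many orders” relationship becomes a separate link table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id)
);
-- Link table: one row per product in an order (many-to-many).
CREATE TABLE order_items (
    order_id   INTEGER NOT NULL REFERENCES orders(id),
    product_id INTEGER NOT NULL REFERENCES products(id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
"""


def build_shop() -> sqlite3.Connection:
    """Create the example schema in memory and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
    conn.executemany("INSERT INTO products VALUES (?, ?)", [(1, "Mug"), (2, "Tea")])
    # One customer, two orders.
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1)])
    # Order 10 holds two products; the mug appears in both orders.
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        [(10, 1, 2), (10, 2, 1), (11, 1, 1)],
    )
    conn.commit()
    return conn


def orders_containing(conn: sqlite3.Connection, product_name: str) -> list:
    """Return the ids of all orders that include the given product."""
    rows = conn.execute(
        "SELECT oi.order_id FROM order_items AS oi"
        " JOIN products AS p ON p.id = oi.product_id"
        " WHERE p.name = ? ORDER BY oi.order_id",
        (product_name,),
    )
    return [row[0] for row in rows]
```

The point of the link table is that neither the order nor the product has to “know” how many of the other it is connected to. Adding a third product to an order is one new row, not a change to the structure.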

Now the important part: the model is decided before the look of the app. The screens, after all, just look at the data. Changing the color of a button is five minutes of work. But if halfway through you discover that a single invoice is supposed to belong to several companies at once, and you didn’t design for it, you’re rewriting half the system along with the data that’s already in it.

That’s why a good vendor wants to understand your business first, not draw screens. A foundation poured wrong doesn’t get redone cheaply once it’s set in concrete.

Your data is your biggest asset, so treat it like one

If you take one single thing away from this lesson, make it this: you have to be able to get to all of your data, any time. Before you sign anything, go through this list:

  • Ownership. A good contract states in black and white that the data is yours, not the vendor’s. It sounds obvious. It isn’t.
  • Export. You need to be able to download all of your data in a format someone else can open too (CSV, JSON, even a database dump). “Export” means completely, not just the five columns they happen to show you on screen.
  • Backups. Regular, automatic, and above all tested. A backup you’ve never actually restored from isn’t a backup, it’s wishful thinking.
  • Where the data physically lives. A server in the EU, or somewhere across an ocean? With personal data that’s not cosmetic, it’s a GDPR question.
  • GDPR. If you collect information about people, you have to know what, why, and for how long. And be able to delete it on request. That’s not just a job for the lawyer, it starts at the data model.
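As a rough picture of what a complete export means, the sketch below walks every table in a database and writes out every column of every row as JSON. It assumes a SQLite database and a made-up customers table; a real system would use its own database’s export or dump tooling, but the principle is the same: all tables, all columns, in a format anyone can open.

```python
import json
import sqlite3


def export_all(conn: sqlite3.Connection) -> dict:
    """Return every user table as a list of row dicts, all columns included."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master"
            " WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    dump = {}
    for table in tables:
        # Table names come from the database's own catalog, not from user input.
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = [col[0] for col in cursor.description]
        dump[table] = [dict(zip(columns, row)) for row in cursor]
    return dump


def demo_export() -> str:
    """Build a tiny example database (hypothetical data) and export it as JSON."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Alice', 'alice@example.com')")
    text = json.dumps(export_all(conn), indent=2)
    conn.close()
    return text


if __name__ == "__main__":
    print(demo_export())
```

Binary columns such as stored files would need extra handling before they fit into JSON, which is one reason a full database dump is often the safer export to ask for.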

Common mistakes that will come back to hurt you

Most data disasters aren’t movie hackers. They’re ordinary things companies put off for years until they blow up.

  • Critical data living only in someone’s spreadsheet. Your whole price list, inventory, or order log in one file on a single colleague’s laptop. One stolen device, one bout of illness, or one resignation, and the company loses an arm.
  • No backups. A disk fails eventually, it’s only a matter of when. Without a backup, the day it fails is the last day of your data.
  • A vendor who holds your data hostage. The site or system runs, but you can only get the data out by paying a ransom, or you can’t get it out at all. It’s the quietest form of vendor lock-in there is; how to guard against it while you’re still choosing a partner is something we dig into in the lesson on technology and the team.
  • Collecting data you have no right or reason to hold. Every unnecessary piece of personal data is a risk, not an asset. What you don’t have can’t leak and doesn’t need guarding. Collect only what you actually use for something.
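On the backup point, “tested” means the copy is actually opened and checked, not just written somewhere. The sketch below shows the idea under simple assumptions: a SQLite database, a made-up orders table, and a comparison of row counts as the check. Production backups use different tools, but the habit of restoring and verifying carries over.

```python
import os
import sqlite3
import tempfile


def backup_and_verify(source: sqlite3.Connection, backup_path: str) -> bool:
    """Copy the database to backup_path, reopen the copy, and compare row counts."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # sqlite3.Connection.backup, Python 3.7+
    finally:
        target.close()

    # "Restore": open the backup file as a fresh connection and check it.
    restored = sqlite3.connect(backup_path)
    try:
        ok = restored.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        original = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
        copied = restored.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
        return ok and original == copied
    finally:
        restored.close()


def demo_backup() -> bool:
    """Create a small example database (hypothetical data), back it up, verify it."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders (total_cents) VALUES (?)", [(4990,), (1250,)]
    )
    source.commit()
    with tempfile.TemporaryDirectory() as folder:
        result = backup_and_verify(source, os.path.join(folder, "backup.db"))
    source.close()
    return result


if __name__ == "__main__":
    print("backup verified:", demo_backup())
```

A row count is the weakest useful check. It catches an empty or truncated backup, which is the most common unpleasant surprise, but it won’t notice a row whose contents are wrong.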

Data outlives the code, the design, and your vendor. So put it first: understand what your software remembers, keep it backed up, and always be able to take it with you. Everything else can be fixed. Lost data can’t.