Data and databases

Lesson 4

Why This Matters

# Data and Databases - Cambridge IB Computer Science Summary

## Key Learning Outcomes

Students will master the fundamental concepts of relational database design, including entity-relationship modelling, normalisation (up to 3NF), and SQL query construction for data manipulation and retrieval. The course covers data types, structures, and the distinction between primary and foreign keys, emphasising how databases ensure data integrity, reduce redundancy, and support concurrent access. Practical skills include creating efficient database schemas, writing complex SQL queries (SELECT, JOIN, WHERE clauses), and understanding transaction processing and ACID properties.

## Exam Relevance

This topic features prominently in both Paper 1 (core content) and the internal assessment, with typical questions requiring students to design ER diagrams, normalise relations, write SQL queries, and evaluate database implementations against real-world scenarios. Candidates must demonstrate both theoretical understanding and practical application skills.

Key Words to Know

01. Data - Raw facts, figures, or pieces of information, like a name, number, or image.
02. Database - An organized collection of data, stored and accessed electronically, like a digital filing cabinet.
03. Database Management System (DBMS) - The software that allows users to create, maintain, and interact with a database.
04. Query - A request for information from a database, written in a language the DBMS understands (such as SQL).
05. Table - A collection of related data organized in rows and columns within a relational database, similar to a spreadsheet.
06. Record (Row) - A single entry in a database table, containing all the information about one specific item or person.
07. Field (Column) - A specific category of information within a database table, like 'Name' or 'Age'.
08. Primary Key - A unique identifier for each record in a table, so that no two records share the same key value (e.g., a student ID number).
09. Relational Database - A database that stores and provides access to data points that are related to one another, typically organized into tables.
10. Data Integrity - Ensuring that data is accurate, consistent, and reliable throughout its lifecycle.

Core Concepts & Theory

Data refers to raw, unprocessed facts and figures without context or meaning. Information is processed data that has been organized and given context, making it meaningful and useful for decision-making.

Databases are organized collections of structured data stored electronically, designed for efficient retrieval, management, and updating. A Database Management System (DBMS) is software that facilitates creating, maintaining, and querying databases.

Key Database Terminology:

Entity - A distinct object or concept about which data is stored (e.g., Student, Product)

Attribute - A characteristic or property of an entity (e.g., StudentName, ProductPrice)

Primary Key - A unique identifier for each record in a table, ensuring no duplicates exist

Foreign Key - An attribute in one table that references the primary key of another table, establishing relationships

Relational Database - Data organized into tables (relations) with rows (records/tuples) and columns (attributes/fields), linked through keys

Flat File Database - A single-table database storing all data in one structure, suitable only for simple applications

Normalization - The process of organizing data to minimize redundancy and remove undesirable dependencies (partial and transitive), typically through First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF)

Data Integrity - Ensuring accuracy, consistency, and reliability of data throughout its lifecycle

Validation - Checking data meets specified rules before entry (e.g., range check, type check)

Verification - Confirming data has been accurately transferred or entered (e.g., double-entry, visual check)

Cambridge Command Words: Define requires precise, concise explanations. Explain demands reasoning with cause-and-effect. Describe needs characteristics without justification.

Detailed Explanation with Real-World Examples

Think of data versus information like raw ingredients versus a finished dish. The raw numbers "25, 12, 2023" are data: meaningless on their own. Interpreted as the date "25th December 2023", they become information with context.

Real-World Database Applications:

E-commerce platforms like Amazon use relational databases with entities: Customer, Order, Product, Payment. A customer's order connects through foreign keys—CustomerID in the Order table references the Customer table's primary key. This prevents storing duplicate customer addresses for each order (redundancy reduction).
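The foreign-key link described above can be sketched with Python's built-in `sqlite3` module standing in for a full DBMS. This is a minimal illustration, not how any real retailer stores data: the table name `Orders` (chosen to avoid the SQL keyword ORDER), the columns `Name`, `Address`, and `Total`, and all sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    Address    TEXT NOT NULL
);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
    Total      REAL NOT NULL
);
""")

# One customer row; the address is stored once, however many orders exist.
conn.execute("INSERT INTO Customer VALUES (1, 'Alice', '12 High Street')")
conn.execute("INSERT INTO Orders VALUES (100, 1, 25.50)")
conn.execute("INSERT INTO Orders VALUES (101, 1, 9.99)")

def order_for_unknown_customer_rejected():
    """Try to store an order whose CustomerID matches no customer."""
    try:
        conn.execute("INSERT INTO Orders VALUES (102, 999, 5.00)")
    except sqlite3.IntegrityError:
        return True
    return False

rejected = order_for_unknown_customer_rejected()
order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
address_copies = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```

Both valid orders point at the single customer row, so the address is held once, and the order for the non-existent customer 999 is refused by the foreign-key constraint.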

School Management Systems demonstrate normalization beautifully. Instead of one massive table repeating teacher details for every class they teach, separate tables exist: Teacher (TeacherID, Name, Department), Class (ClassID, Subject, TeacherID). The TeacherID foreign key links them, eliminating data duplication.
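As a rough sketch of how the two tables are recombined when needed, the query below joins Class to Teacher on TeacherID. It uses Python's `sqlite3` module; the table and column names come from the example above, while the sample rows (including the department name) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teacher (
    TeacherID  INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    Department TEXT NOT NULL
);
CREATE TABLE Class (
    ClassID   INTEGER PRIMARY KEY,
    Subject   TEXT NOT NULL,
    TeacherID INTEGER NOT NULL REFERENCES Teacher(TeacherID)
);
-- Sample data (made up): one teacher who teaches two classes.
INSERT INTO Teacher VALUES (1, 'Dr. Smith', 'Computing');
INSERT INTO Class VALUES (10, 'Programming', 1), (11, 'Databases', 1);
""")

# The teacher's details are stored once and looked up through the foreign key.
rows = conn.execute("""
    SELECT Class.Subject, Teacher.Name, Teacher.Department
    FROM Class
    JOIN Teacher ON Class.TeacherID = Teacher.TeacherID
    ORDER BY Class.ClassID
""").fetchall()
```

Here `rows` holds one line per class with the teacher's name and department attached, even though those details exist in only one Teacher record.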

Healthcare databases prioritize data integrity. A patient's blood type must validate against permitted values (A, B, AB, O) with + or - modifiers. Verification through double-entry by different medical staff catches critical errors.
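One way to express the blood-type rule is as a validation constraint inside the database itself. The sketch below assumes a hypothetical `Patient` table and uses a SQL CHECK constraint via Python's `sqlite3`; real clinical systems will differ, and the patient rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Patient (
    PatientID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    -- Validation rule: only the eight permitted blood types are accepted.
    BloodType TEXT NOT NULL
        CHECK (BloodType IN ('A+', 'A-', 'B+', 'B-', 'AB+', 'AB-', 'O+', 'O-'))
)
""")

def try_insert(patient_id, name, blood_type):
    """Return True if the row passes validation and is stored, else False."""
    try:
        conn.execute(
            "INSERT INTO Patient VALUES (?, ?, ?)",
            (patient_id, name, blood_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False

valid = try_insert(1, "Alice", "AB-")   # permitted value, so it is stored
invalid = try_insert(2, "Bob", "C+")    # not a blood type, so it is rejected
```

Note that this only checks the value is permitted, not that it is the patient's actual blood type; catching that kind of error is the job of verification.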

Flat files versus relational databases: Imagine a spreadsheet listing students with their courses, where each student-course combination repeats the student's details (name, email, address). This flat file causes update anomalies: changing a student's email requires finding and editing every row they appear in, and missing one leaves the data inconsistent. A relational approach splits this into Student and Course tables linked through an Enrollment table, so the email is stored and updated in exactly one place.
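The single-update benefit can be demonstrated with a small sketch using Python's `sqlite3`. The Student, Course, and Enrollment tables follow the description above; the column names, email addresses, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
CREATE TABLE Course  (CourseID TEXT PRIMARY KEY, Title TEXT);
CREATE TABLE Enrollment (
    StudentID INTEGER REFERENCES Student(StudentID),
    CourseID  TEXT REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
INSERT INTO Student VALUES (101, 'Alice', 'alice@old.example');
INSERT INTO Course VALUES ('CS101', 'Programming'), ('MA201', 'Calculus');
INSERT INTO Enrollment VALUES (101, 'CS101'), (101, 'MA201');
""")

# The email lives in exactly one row, so one UPDATE is enough.
cursor = conn.execute(
    "UPDATE Student SET Email = 'alice@new.example' WHERE StudentID = 101"
)
rows_changed = cursor.rowcount

# Every enrollment sees the new email because it is looked up, not copied.
emails = conn.execute("""
    SELECT Course.Title, Student.Email
    FROM Enrollment
    JOIN Student ON Enrollment.StudentID = Student.StudentID
    JOIN Course  ON Enrollment.CourseID = Course.CourseID
    ORDER BY Course.CourseID
""").fetchall()
```

Only one row changes, yet both of the student's enrollments report the new address. In the flat-file version the same change would have to be repeated on every student-course row.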

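Primary keys in practice: Your passport number uniquely identifies your passport among those issued by your country, so together with the issuing country it can act as a primary key in immigration databases. No two passports from the same country share a number, ensuring accurate tracking. Similarly, an ISBN uniquely identifies one edition of a book, preventing confusion between editions.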
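A DBMS enforces this uniqueness automatically once a column is declared as the primary key. The sketch below uses Python's `sqlite3` with a hypothetical `Book` table; the ISBN shown is a placeholder string, not a real ISBN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (ISBN TEXT PRIMARY KEY, Title TEXT NOT NULL)")

# 'ISBN-PLACEHOLDER-1' is an invented value used only for this example.
conn.execute("INSERT INTO Book VALUES ('ISBN-PLACEHOLDER-1', 'Example Title')")

try:
    # A second record with the same key value is refused by the DBMS.
    conn.execute("INSERT INTO Book VALUES ('ISBN-PLACEHOLDER-1', 'A Different Book')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

book_count = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
```

The second insert fails, so the table still contains a single record for that key.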

Memory Aid - ACID properties: Atomicity, Consistency, Isolation, Durability ensure reliable database transactions—like a complete bank transfer (not half-processed).
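Atomicity in particular is easy to see in code. The following is a minimal sketch of a bank transfer using Python's `sqlite3`, assuming a hypothetical `Account` table whose CHECK constraint forbids negative balances. The credit is deliberately applied before the debit so that a failure part-way through leaves something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Account (
    AccountID INTEGER PRIMARY KEY,
    Balance   INTEGER NOT NULL CHECK (Balance >= 0)
)
""")
conn.execute("INSERT INTO Account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(amount, source, target):
    """Move money between accounts: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE AccountID = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE AccountID = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30, 1, 2)       # succeeds: balances become 70 and 80
failed = transfer(500, 1, 2)  # debit would go negative, so the credit is undone

balances = conn.execute(
    "SELECT Balance FROM Account ORDER BY AccountID"
).fetchall()
```

After the failed transfer the balances are still 70 and 80: the credit that had already been applied to account 2 is rolled back along with the rejected debit, so no half-processed transfer is ever stored.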

Worked Examples & Step-by-Step Solutions

Example 1: Normalization to 3NF

Question: Normalize this table to Third Normal Form:

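| StudentID | StudentName | CourseID | CourseName | InstructorName |
|-----------|-------------|----------|-------------|----------------|
| 101 | Alice | CS101 | Programming | Dr. Smith |
| 101 | Alice | MA201 | Calculus | Prof. Lee |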

Solution:

Step 1 - 1NF: Already achieved (atomic values, no repeating groups, primary key exists)

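Step 2 - 2NF: Remove partial dependencies. The primary key is the composite key (StudentID, CourseID). StudentName depends only on StudentID, while CourseName and InstructorName depend only on CourseID (assuming each course has one instructor), so all three are partial dependencies. Split:

Student_Course Table: StudentID, CourseID

Student Table: StudentID (PK), StudentName

Course Table: CourseID (PK), CourseName, InstructorName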

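Step 3 - 3NF: Remove transitive dependencies. In the Course table, InstructorName describes the instructor rather than the course itself, so an instructor who teaches several courses would have their name repeated. Introducing an InstructorID makes the chain explicit: CourseID determines InstructorID, which in turn determines InstructorName. Moving instructor details into their own table removes this transitive dependency. Final structure: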

Enrollment: StudentID (FK), CourseID (FK), with (StudentID, CourseID) as the composite PK

Student: StudentID (PK), StudentName

Course: CourseID (PK), CourseName, InstructorID (FK)

Instructor: InstructorID (PK), InstructorName
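The final structure can be checked with a short sketch in Python's `sqlite3`: build the four tables, load the sample data, and confirm that a JOIN reproduces the original unnormalized rows. The InstructorID values 1 and 2 are invented for the example, since the original table has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student    (StudentID INTEGER PRIMARY KEY, StudentName TEXT NOT NULL);
CREATE TABLE Instructor (InstructorID INTEGER PRIMARY KEY, InstructorName TEXT NOT NULL);
CREATE TABLE Course (
    CourseID     TEXT PRIMARY KEY,
    CourseName   TEXT NOT NULL,
    InstructorID INTEGER NOT NULL REFERENCES Instructor(InstructorID)
);
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID  TEXT NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
INSERT INTO Student VALUES (101, 'Alice');
-- InstructorID values are assumptions; the source table has no IDs.
INSERT INTO Instructor VALUES (1, 'Dr. Smith'), (2, 'Prof. Lee');
INSERT INTO Course VALUES ('CS101', 'Programming', 1), ('MA201', 'Calculus', 2);
INSERT INTO Enrollment VALUES (101, 'CS101'), (101, 'MA201');
""")

# Joining the four tables rebuilds the original table, so no data was lost.
original_rows = conn.execute("""
    SELECT s.StudentID, s.StudentName, c.CourseID, c.CourseName, i.InstructorName
    FROM Enrollment e
    JOIN Student s    ON e.StudentID = s.StudentID
    JOIN Course c     ON e.CourseID = c.CourseID
    JOIN Instructor i ON c.InstructorID = i.InstructorID
    ORDER BY c.CourseID
""").fetchall()
```

Each fact (Alice's name, each course name, each instructor name) is now stored exactly once, yet the query returns the same two rows as the table in the question.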

Examiner Note: Award marks for identifying dependency types and systematic progression through normal forms.


Example 2: Validation vs. Verification

Question (4 marks): Explain the difference between validation and verification, giving one example of each.

Model Answer:

Validation checks that data meets predefined rules before it is accepted [1], such as a range check ensuring age values are between 0 and 120 [1]. Verification confirms that data has been accurately entered or transferred [1], such as double-entry password systems requiring identical inputs twice [1].

Examiner Tip: Clear distinction + specific, relevant examples = full marks.
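The distinction in the model answer can be made concrete with two small Python functions. This is an illustrative sketch only; the function names are made up, and the 0 to 120 range comes from the model answer above.

```python
def validate_age(value):
    """Validation: a type check followed by a range check (0 to 120 inclusive)."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 120

def verify_double_entry(first, second):
    """Verification: the value must be entered identically both times."""
    return first == second

age_ok = validate_age(17)             # sensible value passes
age_too_big = validate_age(150)       # fails the range check
age_wrong_type = validate_age("17")   # fails the type check

match = verify_double_entry("s3cret", "s3cret")
mismatch = verify_double_entry("s3cret", "s3crte")
```

Validation cannot tell whether a value is correct, only whether it is reasonable: typing 71 instead of 17 still passes the range check. Verification targets exactly that kind of transcription error.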

Common Exam Mistakes & How to Avoid Them

Mistake 1: Confusing Data with Information

Why it happens: Students treat terms interchangeably.

How to avoid:...

Cambridge Exam Technique & Mark Scheme Tips

Command Word Mastery:

Define (1-2 marks): Concise, precise meaning only. "A primary key is a unique identifier ...

Exam Tips

  1. Always define key terms like 'data', 'database', 'DBMS', 'query', 'table', 'record', and 'field' clearly and concisely.
  2. Be ready to provide a real-world example of a database in action (e.g., online shopping, school records, social media) and explain how data is stored and retrieved.
  3. Understand the difference between data (the raw facts) and a database (the organized container for those facts).
  4. Know the purpose of a primary key and why it's important for uniquely identifying records.
  5. Practice explaining the benefits of using a database over a simple spreadsheet for managing large amounts of information (e.g., efficiency, security, multi-user access).