Types of Data
Root Concept
Data comes in different forms: structured data is highly organized and easy to analyze, unstructured data is raw and requires significant processing, and semi-structured data falls in between.
Concept Development By codeplu.com
Mapping real-world information into the three core data types
What are Types of Data?
In the digital world, data is not always captured in the same format. Before a data scientist can analyze information, they must first understand how organized or unorganized that information is.
Data generally falls into three categories: structured data (which follows a fixed, predefined schema), unstructured data (which has no predefined data model), and semi-structured data (which sits in between). The type of data you have largely determines which tools you can use to store and analyze it.
How Different Data Types Work
Structured Data
This type of data is organized into rows and columns that follow a predefined schema: every record has the same fields, and each field has a known type. Because of that consistency, it is easy for computers to store, search, and analyze using standard tools such as SQL. Common examples include financial transaction records in an Excel spreadsheet or user accounts in a SQL database. The result is data that can be queried and summarized directly, although it may still need checks for missing or incorrect values.
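As a minimal sketch of why a fixed set of columns makes analysis easy, the example below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (account, amount) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 75.5), ("alice", 30.0)],
)

# Every row has the same columns, so grouping and summing is a single query.
totals = conn.execute(
    "SELECT account, SUM(amount) FROM transactions GROUP BY account ORDER BY account"
).fetchall()
conn.close()

print(totals)
```

No parsing or cleanup step is needed before the query, which is the practical advantage of structured data.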
Unstructured Data
This type of data has no predefined data model or fixed internal format. It makes up most of the data created today and includes things like free-form text documents, customer reviews, social media posts, images, videos, and audio files. Because it is not organized into fields, it requires substantial processing (such as computer vision or natural language processing) before a machine can analyze it.
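To illustrate the kind of processing unstructured text needs, here is a toy sketch that turns a raw review into a structured word count. The review text is hypothetical, and the regular-expression tokenizer is a deliberate simplification, not a real natural language processing pipeline.

```python
import re
from collections import Counter

# Hypothetical customer review, used only for illustration.
review = "Great phone, great battery. The camera is not great in low light!"

# Step 1: impose some structure by lowercasing and splitting into word tokens.
tokens = re.findall(r"[a-z']+", review.lower())

# Step 2: summarize the tokens as a structured mapping of word -> count.
counts = Counter(tokens)

print(counts.most_common(1))
```

Counting words alone misses that "not great" is negative, which hints at why real systems rely on heavier language models for this step.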
Semi-structured Data
This type sits between structured and unstructured data. It does not live in a strict table of rows and columns, but it does contain tags, keys, or hierarchies that separate and label its elements. JSON and XML files used in web development are common examples. It gives programmers flexibility, since records need not all share the same fields, while keeping enough structure for programs to parse the information reliably.
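The sketch below shows the same hypothetical user record written as JSON and as XML, parsed with Python's standard json and xml.etree.ElementTree modules. The field names and values are assumptions made up for the example.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in two semi-structured formats.
json_text = '{"user": {"id": 42, "name": "Ada", "tags": ["admin", "beta"]}}'
xml_text = (
    "<user id='42'><name>Ada</name>"
    "<tags><tag>admin</tag><tag>beta</tag></tags></user>"
)

# JSON: keys and nesting label each element.
user = json.loads(json_text)["user"]
json_tags = user["tags"]

# XML: element names and attributes play the same labeling role.
root = ET.fromstring(xml_text)
xml_name = root.findtext("name")
xml_tags = [tag.text for tag in root.findall("./tags/tag")]

print(json_tags, xml_name, xml_tags)
```

Neither format needs a table schema, yet the labels are enough for a parser to locate each value.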
Real World Example
Social Media Data Analysis
A breakdown of how one user's activity on a social app generates all three types of data, which must then be analyzed together.
Structured Data
When you create an account, your basic user profile (Age: 25, Location: New York, Join Date: 2026-01-15) is neatly stored in a rigid, highly searchable SQL database.
Unstructured Data
The actual content you generate every day (your long text posts, the comments you leave, the photos you upload, and the live videos you stream) is unstructured.
Semi-structured Data
When the mobile app requests your feed from the server, the backend API typically responds with a JSON document. Its keys and nesting bundle structured fields, such as your User ID, with your unstructured photo (usually as a link to the image file) so your phone knows how to display it.
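A feed response of this kind might look like the sketch below. The payload shape, field names, and URL are hypothetical, not the format of any real social app's API.

```python
import json

# Hypothetical API response body; all field names are illustrative.
response_body = """
{
  "user_id": 1001,
  "posts": [
    {"post_id": 1, "caption": "Sunset in NYC", "photo_url": "https://example.com/p/1.jpg"},
    {"post_id": 2, "caption": "Coffee time"}
  ]
}
"""

feed = json.loads(response_body)

# Flatten the nested document into table-like rows. The optional photo_url
# key is read with .get(), so posts without a photo yield None.
rows = [
    (feed["user_id"], post["post_id"], post.get("photo_url"))
    for post in feed["posts"]
]

print(rows)
```

The second post has no photo, which a JSON document tolerates without any schema change.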
The Analysis Challenge
To understand user behavior, companies must build pipelines that analyze all of this together, for example by linking a structured demographic profile to the sentiment extracted from that user's unstructured text posts.
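As a rough sketch of such a pipeline, the example below joins structured profiles with a sentiment score computed from free-text posts. The profiles, posts, and the tiny word lists are all hypothetical, and the word-list scoring is a stand-in for a real sentiment model.

```python
# Toy sentiment lexicon (an assumption for illustration, not a real model).
POSITIVE = {"love", "great", "happy"}
NEGATIVE = {"hate", "terrible", "sad"}

# Structured side: hypothetical user profiles keyed by user ID.
profiles = {
    1: {"age": 25, "location": "New York"},
    2: {"age": 41, "location": "Austin"},
}

# Unstructured side: hypothetical (user_id, post text) pairs.
posts = [
    (1, "I love this great city"),
    (2, "Terrible traffic, I hate it"),
    (1, "So happy today"),
]


def sentiment_score(text):
    """Return positive-word count minus negative-word count."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Aggregate sentiment per user, then join it onto the structured profile.
scores = {}
for user_id, text in posts:
    scores[user_id] = scores.get(user_id, 0) + sentiment_score(text)

report = [
    {"user_id": uid, **profiles[uid], "sentiment": scores.get(uid, 0)}
    for uid in sorted(profiles)
]

print(report)
```

The join key is the user ID, the one field that the structured and unstructured records share.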
Final Words
Understanding the different types of data is essential in data science. It acts as a compass that helps you choose the right tools and algorithms for processing and analysis.
Once you can recognize how data is structured (or unstructured), you can design pipelines that handle large, messy real-world datasets effectively.