In the previous module, we learned that Machine Learning (ML) is a subset of Artificial Intelligence. It’s not just a part of AI; it’s the engine that powers most of the AI advancements we see today. So, what exactly is it?

The Core Idea: Learning From Data

The traditional way to make a computer do something is to give it explicit, step-by-step instructions. This is called explicit programming. For example, to write a program that calculates a 10% discount, you would write a rule like this: final_price = original_price * 0.90. The program can’t do anything else; it just follows the rule you wrote.
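To make the contrast concrete, here is that discount rule written out explicitly in Python (the function name is just illustrative):

```python
def final_price(original_price: float) -> float:
    """Explicit programming: the 10% discount rule is hard-coded by the developer."""
    return original_price * 0.90

print(final_price(100.0))
```

Every behavior of this program was decided in advance by the person who wrote it; nothing here is learned.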

Machine Learning flips this on its head.

Instead of giving the computer rules, you give it data. The ML algorithm then processes this data and “learns” the patterns and rules on its own. Arthur Samuel, an AI pioneer, defined it perfectly back in 1959: Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.

Analogy: Learning to Identify Animals

  • Explicit Programming: You could try to write a set of rules for a computer to identify a cat: “If it has pointy ears, AND whiskers, AND a long tail, AND fur… then it’s a cat.” This is brittle and will fail quickly. What about a cat with folded ears or no tail?
  • Machine Learning: You show the computer thousands of pictures labeled “cat” and thousands of pictures labeled “not a cat.” The algorithm analyzes all these examples and learns the underlying patterns of what makes a cat a cat on its own. After this “training,” it can accurately identify cats in new pictures it has never seen before. It learned the “rules” from the data, rather than being handed them.
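The contrast above can be sketched in a few lines of Python. This is a toy k-nearest-neighbour classifier: the feature vectors and labels are made up (standing in for the labeled pictures), and the point is only that the program derives its decisions from labeled examples rather than from hand-written "cat rules":

```python
from collections import Counter
import math

# Made-up "training data": feature vectors (ear_pointiness, tail_length)
# labeled by a human -- these stand in for the labeled pictures.
training_data = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.7, 0.7), "cat"),
    ((0.2, 0.1), "not a cat"),
    ((0.1, 0.3), "not a cat"),
    ((0.3, 0.2), "not a cat"),
]

def predict(features, k=3):
    """k-nearest-neighbour: label a new example by majority vote among
    the k most similar training examples -- no explicit rules written."""
    by_distance = sorted(training_data, key=lambda item: math.dist(features, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(predict((0.85, 0.75)))  # near the "cat" examples
```

A real image classifier learns from pixels rather than two hand-picked numbers, but the principle is the same: the "rules" come from the data.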

The Typical Machine Learning Workflow

While there are many different types of ML models, most projects follow a similar high-level process. Understanding this workflow is key to understanding how ML is applied in the real world.
  1. Data Collection: This is the foundation of any ML project. You need to gather large amounts of relevant, high-quality data. The data could be anything from spreadsheets and images to text and sensor readings. The quality and quantity of this data will directly impact the performance of your model.
  2. Data Preparation (or “Data Preprocessing”): Raw data is almost never ready to be used. This is often the most time-consuming step. It involves:
    • Cleaning: Fixing errors, handling missing values.
    • Formatting: Ensuring the data is in a consistent and usable format.
    • Splitting: Dividing the data into a “training set” (for the model to learn from) and a “testing set” (to evaluate its performance later).
  3. Model Training: This is where the “learning” happens. You select an appropriate ML algorithm (like the ones we’ll discuss in the next sections) and feed it your prepared training data. The algorithm processes the data, adjusts its internal parameters, and creates a “model”—a mathematical representation of the patterns it found.
  4. Evaluation: Once the model is trained, you need to see how well it works. You use the “testing set” (which the model has never seen before) to check its accuracy and performance. This step answers the question: “Did the model actually learn the right things, and can it make accurate predictions on new data?”
  5. Deployment & Monitoring: If the model’s performance is good, it’s ready to be put into a real-world application. This is called deployment. For example, a spam detection model would be integrated into an email service. The work isn’t over, though. You need to continuously monitor the model’s performance to ensure it remains accurate over time as new data comes in.
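The five steps above can be sketched in one minimal, self-contained script. Everything here is a deliberately simplified stand-in: the data is synthetic, and the "model" is just a learned decision threshold (a real project would typically use a library such as scikit-learn):

```python
import random

random.seed(0)

# 1. Data collection: synthetic stand-in data -- each example is
#    (message_length, label), where longer messages tend to be "spam".
data = [(random.gauss(80, 10), "spam") for _ in range(100)] + \
       [(random.gauss(40, 10), "not spam") for _ in range(100)]

# 2. Data preparation: shuffle, then split into training and testing sets.
random.shuffle(data)
train, test = data[:160], data[160:]

# 3. Model training: "learn" a decision threshold from the training data
#    (the midpoint of the two class means -- a deliberately simple model).
spam_mean = sum(x for x, y in train if y == "spam") / sum(1 for _, y in train if y == "spam")
ham_mean = sum(x for x, y in train if y == "not spam") / sum(1 for _, y in train if y == "not spam")
threshold = (spam_mean + ham_mean) / 2

def model(length):
    return "spam" if length > threshold else "not spam"

# 4. Evaluation: accuracy on the held-out testing set,
#    which the model never saw during training.
accuracy = sum(model(x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.0%}")

# 5. Deployment & monitoring: in practice, `model` would be wrapped in a
#    real service, and its accuracy tracked as new data arrives.
```

Note how the threshold is computed only from the training set, and accuracy only on the testing set; mixing the two is a classic mistake that makes a model look better than it is.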