From Pixels to Predictions: Building Your First Fashion Classifier with Keras



Ever wondered how a website can instantly recognize the type of clothing in a photo? It might seem like magic, but it’s the power of neural networks at work. In this guide, we’ll walk through building your very own neural network to classify articles of clothing. We’ll use the popular Keras library in Python, making the process surprisingly straightforward. Let’s get started!

# Meet the Dataset

Every great machine learning model starts with great data. For this project, we’ll use the Fashion MNIST dataset, which is conveniently included in Keras. It’s a fantastic starting point for computer vision.

This dataset contains 70,000 grayscale images of clothing items, split into 60,000 for training our model and 10,000 for testing its performance.

import tensorflow as tf
from tensorflow import keras
import numpy as np

# Load the dataset
fashion_mnist = keras.datasets.fashion_mnist 
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

Each image is a small, 28x28 pixel grid. This means we have 60,000 images, each represented by a 28x28 array of numbers. The numbers themselves are pixel intensity values, ranging from 0 (black) to 255 (white).
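
You can confirm these shapes and value types yourself with a quick inspection:

print(train_images.shape)   # (60000, 28, 28)
print(test_images.shape)    # (10000, 28, 28)
print(train_images.dtype)   # uint8, raw pixel intensities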

The labels are simply numbers from 0 to 9, each corresponding to a specific category of clothing. To make sense of them, we can create a list of class names:

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

So, a label of 2 means the image is a ‘Pullover’, and a 9 means it’s an ‘Ankle boot’.
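
As a quick sketch of the lookup (on Fashion MNIST, the first training image happens to be an ankle boot):

# Map a numeric label to its human-readable name
first_label = train_labels[0]
print(first_label)               # 9
print(class_names[first_label])  # Ankle boot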

# Prepping the Data

Before we can feed our images to the model, we need to do a little bit of prep work. This step, known as preprocessing, is crucial for helping our model learn efficiently.

Our pixel values currently range from 0 to 255. We’re going to scale them down to the range 0 to 1. Why? Because gradient-based training tends to converge faster and more reliably when inputs are small and consistently scaled. The process is simple: we just divide every pixel value by 255.0.

train_images = train_images / 255.0
test_images = test_images / 255.0
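
A quick sanity check confirms the rescaling worked:

# Values should now lie between 0.0 and 1.0
print(train_images.min(), train_images.max())  # 0.0 1.0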

# Designing Our Neural Network 🧠

Now for the exciting part—building our model’s brain! We’ll use a keras.Sequential model, which is a straightforward stack of layers. Our network will have three layers, each with a specific job.

model = keras.Sequential([
    # Input Layer
    keras.layers.Flatten(input_shape=(28, 28)),
    
    # Hidden Layer
    keras.layers.Dense(128, activation='relu'),
    
    # Output Layer
    keras.layers.Dense(10, activation='softmax')
])

Let’s break that down:

  • Input Layer: The Flatten layer is our entry point. Its job is to take the 28x28 grid of pixels and “unroll” it into a single, flat line of 784 values (28 * 28 = 784). It learns nothing itself; it simply reshapes the data for the next layer.
  • Hidden Layer: This is a Dense layer, which means every neuron in it is connected to every neuron from the previous layer. It has 128 neurons and uses the popular ‘relu’ (Rectified Linear Unit) activation function. Think of this layer as where the model does most of its “thinking,” learning to find patterns in the data.
  • Output Layer: This final Dense layer has 10 neurons—one for each of our clothing classes. It uses a ‘softmax’ activation function, which is perfect for classification. Softmax converts the layer’s raw output into a probability distribution. In other words, each of the 10 neurons will output a value between 0 and 1, representing the model’s confidence that the image belongs to that class. All 10 probabilities will add up to 1.
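
If you’re curious how many parameters the model is learning, model.summary() will print them. The arithmetic is straightforward: the hidden layer has 784 * 128 weights plus 128 biases, and the output layer has 128 * 10 weights plus 10 biases.

model.summary()
# Flatten:      0 parameters (it only reshapes the input)
# Dense (128):  784 * 128 + 128 = 100,480 parameters
# Dense (10):   128 * 10 + 10   =   1,290 parameters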

# Setting the Rules with compile()

Before training, we need to configure the model by defining a few key things:

  • Optimizer: This is the algorithm that adjusts the model’s internal parameters to minimize error. ‘adam’ is a popular and effective choice.
  • Loss Function: This is how the model measures how wrong its predictions are. Since we have multiple categories, ‘sparse_categorical_crossentropy’ is the right tool for the job.
  • Metrics: This is what we want to monitor during training. We’ll track ‘accuracy’ to see what percentage of images are correctly classified.

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
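
The ‘sparse’ prefix matters: it tells Keras our labels are plain integers (0–9). If you one-hot encoded the labels instead, you’d use plain ‘categorical_crossentropy’. A minimal sketch of that alternative, in case you run into it elsewhere:

# Alternative: one-hot encode the labels and drop the 'sparse' prefix
one_hot_train_labels = keras.utils.to_categorical(train_labels, num_classes=10)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Both reach the same result; sparse labels just skip the encoding step.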

# Let the Training Begin!

With all the setup done, training the model is as simple as calling one command. We’ll pass it our training data and labels and tell it to run for 10 epochs, which means it will go through the entire training dataset 10 times.

model.fit(train_images, train_labels, epochs=10)

You’ll see the accuracy improve with each epoch as the model learns!

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 3ms/step - accuracy: 0.7812 - loss: 0.6275
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.8635 - loss: 0.3816
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - accuracy: 0.8790 - loss: 0.3375
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.8841 - loss: 0.3119
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - accuracy: 0.8912 - loss: 0.2927
Epoch 6/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.8968 - loss: 0.2779
Epoch 7/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - accuracy: 0.9011 - loss: 0.2636
Epoch 8/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.9042 - loss: 0.2583
Epoch 9/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - accuracy: 0.9076 - loss: 0.2493
Epoch 10/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.9115 - loss: 0.2367

# How Did We Do? Evaluating the Model

Once training is complete, it’s time to see how our model performs on data it has never seen before: our test set.

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=1)

print('\nTest accuracy:', test_acc)

You’ll likely notice that the test accuracy is a little lower than the final training accuracy.

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8733 - loss: 0.3425

Test accuracy: 0.8755000233650208

This gap is a common phenomenon called overfitting: the model has become slightly too specialized to the training data, so it performs a bit worse on images it has never seen. It’s a key challenge in machine learning!
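
A common way to watch for overfitting as you train is to hold out a slice of the training data as a validation set. As a sketch, if we were training from scratch, Keras can do this automatically via the validation_split argument to fit():

# Hold out 10% of the training data; Keras reports val_accuracy each epoch
history = model.fit(train_images, train_labels,
                    epochs=10,
                    validation_split=0.1)
# If val_accuracy plateaus while accuracy keeps climbing, the model is overfitting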

# Making Predictions ✨

The moment of truth! Let’s use our trained model to make a prediction on an image. The model.predict() method takes an array of images and returns the model’s output for each.

predictions = model.predict(test_images)
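
Because we passed in all 10,000 test images, the result is one row of 10 probabilities per image:

print(predictions.shape)  # (10000, 10)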

Let’s look at the prediction for the very first test image:

print(predictions[0])
# Result:
[1.4218572e-06 3.7644535e-13 2.7878860e-08 7.9793567e-09 3.7700011e-07
 2.3166942e-03 4.1341696e-08 5.6924592e-03 3.4447805e-08 9.9198902e-01]

This array of 10 numbers represents the model’s confidence for each of the 10 clothing classes. The highest value corresponds to the most likely class. We can use NumPy’s argmax function to find the index of that highest value.

# Find the index with the highest probability
predicted_class = np.argmax(predictions[0])
print(predicted_class)
# Output: 9

The model predicts class 9. Let’s check our class_names list… index 9 is ‘Ankle boot’.

Was it right? Let’s look at the actual label for that image:

# Check the actual label
print(test_labels[0])
# Output: 9

Success! The model correctly identified the first image as an Ankle boot.
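
One correct prediction is encouraging, but we can just as easily score the whole test set by taking the argmax of every row and comparing it against the true labels; this should land close to the accuracy that evaluate() reported:

# Fraction of test images whose top prediction matches the true label
predicted_classes = np.argmax(predictions, axis=1)
print(np.mean(predicted_classes == test_labels))  # ~0.87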

Congratulations, you’ve just built, trained, and used a neural network!