A machine learning pipeline that compares deep learning models (CNN, ResNet) with traditional methods for classifying Fashion MNIST items into 10 categories, reaching 91% accuracy.
// Overview
This research project applies machine learning to the Fashion MNIST dataset, classifying 28×28 grayscale images of fashion items into ten categories: T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.
The pipeline compares deep learning architectures (CNN and ResNet) with traditional methods such as Logistic Regression, SVM, and Random Forest. Before training, images are preprocessed with normalization and reshaping to improve model performance.
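The normalization and reshaping steps might look roughly like the sketch below. It assumes NumPy, a `[0, 1]` pixel scale, and a channels-last layout for the convolutional models; the function name `preprocess` and the random stand-in batch are illustrative and not taken from the project code.

```python
import numpy as np

def preprocess(images):
    """Return (cnn_input, flat_input) from uint8 images of shape (N, 28, 28)."""
    # Normalization: scale pixel intensities from [0, 255] to [0.0, 1.0].
    scaled = images.astype(np.float32) / 255.0
    # Reshape for convolutional models: add a trailing channel axis (assumed layout).
    cnn_input = scaled.reshape(-1, 28, 28, 1)
    # Reshape for traditional models: flatten each image to 784 features.
    flat_input = scaled.reshape(len(scaled), -1)
    return cnn_input, flat_input

# Random stand-in batch; a real run would load Fashion MNIST here.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
cnn_x, flat_x = preprocess(batch)
print(cnn_x.shape, flat_x.shape)
```

The same scaled array feeds both branches, so the deep and traditional models see identical pixel values and differ only in input shape.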
Grad-CAM visualizations highlight the image regions that drive each CNN prediction. This interpretability helps build trust in model decisions for real-world applications in e-commerce, fashion retail, and manufacturing.
// Architecture
ResNet: A deep learning model that uses residual connections to mitigate vanishing gradients, enabling deeper architectures that generalize well.
CNN: Extracts spatial features from images with convolutional and pooling layers, suited to high-accuracy image classification.
Random Forest: An ensemble method that aggregates decision trees; gives reasonable performance on flattened image data.
SVM: Uses kernel methods for non-linear classification; computationally expensive but effective for complex patterns.
Fully Connected Network: A neural network of dense layers only, limited in performance by its lack of spatial awareness of image data.
Logistic Regression: A simple linear classifier used as a baseline, limited in handling complex image patterns.
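The residual connections described at the top of this section can be illustrated with a toy forward pass. This is a simplified sketch, not the project's ResNet: it uses dense layers in NumPy instead of convolutions and batch normalization, and the names `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Compute y = relu(x + F(x)) with F(x) = relu(x @ w1) @ w2."""
    fx = relu(x @ w1) @ w2   # the learned residual branch F(x)
    return relu(x + fx)      # the skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.random((2, 8))                    # non-negative toy features
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = np.zeros((8, 8))                     # residual branch outputs zero

# With F(x) = 0 the block passes x through unchanged.
y = residual_block(x, w1, w2)
print(np.allclose(y, x))
```

Because the skip path carries the input forward even when the residual branch contributes nothing, gradients have a direct route through the block, which is what makes deeper stacks trainable.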
// Performance
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| T-shirt/top | 0.81 | 0.91 | 0.86 |
| Trouser | 0.98 | 0.99 | 0.98 |
| Pullover | 0.87 | 0.88 | 0.88 |
| Dress | 0.95 | 0.88 | 0.91 |
| Coat | 0.88 | 0.87 | 0.88 |
| Sandal | 0.96 | 0.99 | 0.97 |
| Shirt | 0.79 | 0.71 | 0.74 |
| Sneaker | 0.94 | 0.98 | 0.96 |
| Bag | 0.97 | 0.99 | 0.98 |
| Ankle boot | 1.00 | 0.92 | 0.96 |
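As a sanity check on the table, F1 can be recomputed as the harmonic mean of precision and recall, and the per-class values can be macro-averaged. The sketch below uses only the numbers reported above. Since the inputs are rounded to two decimals, recomputed F1 values can differ from the reported ones by up to about 0.01. If the table was produced on the standard Fashion MNIST test split (an assumption; that split has 1,000 images per class), macro recall equals overall accuracy, and it comes to roughly 0.912, in line with the 91% headline figure.

```python
# (precision, recall, reported F1) per class, copied from the table above.
report = {
    "T-shirt/top": (0.81, 0.91, 0.86),
    "Trouser": (0.98, 0.99, 0.98),
    "Pullover": (0.87, 0.88, 0.88),
    "Dress": (0.95, 0.88, 0.91),
    "Coat": (0.88, 0.87, 0.88),
    "Sandal": (0.96, 0.99, 0.97),
    "Shirt": (0.79, 0.71, 0.74),
    "Sneaker": (0.94, 0.98, 0.96),
    "Bag": (0.97, 0.99, 0.98),
    "Ankle boot": (1.00, 0.92, 0.96),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

recomputed = {name: f1(p, r) for name, (p, r, _) in report.items()}
macro_precision = sum(p for p, _, _ in report.values()) / len(report)
macro_recall = sum(r for _, r, _ in report.values()) / len(report)
hardest = min(report, key=lambda name: report[name][2])

for name, (_, _, reported) in report.items():
    print(f"{name:12s} reported={reported:.2f} recomputed={recomputed[name]:.3f}")
print(f"macro precision={macro_precision:.3f} macro recall={macro_recall:.3f}")
print(f"lowest F1: {hardest}")
```

The lowest-scoring class is Shirt, which matches the roadmap item about confusion between T-shirts and Shirts.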
// Explainability
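The core Grad-CAM computation behind the CNN visualizations can be sketched in a few lines. This is a minimal NumPy illustration, not the project's implementation: it assumes the feature maps of one convolutional layer and the gradients of the target class score with respect to them have already been obtained from a deep learning framework. The function name `grad_cam` and the 7×7×16 toy shapes are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap for one image.

    activations: (H, W, K) feature maps from a convolutional layer.
    gradients:   (H, W, K) gradient of the target class score w.r.t. them.
    """
    # Channel weights: average each channel's gradient over the spatial axes.
    weights = gradients.mean(axis=(0, 1))                        # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)  # shape (H, W)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy inputs standing in for real activations and gradients.
rng = np.random.default_rng(0)
acts = rng.random((7, 7, 16))
grads = rng.normal(size=(7, 7, 16))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

In practice the heatmap would then be upsampled to the 28×28 input size and overlaid on the image; that step is omitted here.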
// Roadmap
Extend models to handle RGB images with complex backgrounds for improved real-world applicability.
Address misclassifications between similar items like T-shirts and Shirts through targeted training.
Explore combinations of traditional and deep learning techniques for enhanced accuracy and efficiency.
// Downloads
Complete documentation including methodology, results, and analysis.
Final project presentation with key findings and visualizations.
Complete Python implementation with CNN, ResNet, and traditional ML models.
// Implementation
// Contributors
kathrey@asu.edu
asinh117@asu.edu
achatt53@asu.edu