A machine learning project for predicting product prices with ensemble methods, meta-learning (stacking), and computer vision. The best model, a stacked meta-learner, reaches 50.12% SMAPE (lower is better).
```bash
# Clone the repository
git clone <repository-url>
cd amazonml-price-prediction

# Install dependencies
pip install -r requirements.txt

# Run the best model (Meta-Learning)
cd models
python meta_learning_model.py
```

```
├── models/                        # 🎯 Main optimized models (BEST)
│   ├── meta_learning_model.py     # 🥇 Best: 50.12% SMAPE
│   ├── neural_enhanced_model.py   # 🥈 50.45% SMAPE
│   └── computer_vision_model.py   # 🥉 ~50% SMAPE
│
├── experiments/                   # 🧪 Research & development
│   ├── ensemble/                  # Ensemble approaches
│   ├── optimization/              # Advanced optimizations
│   └── legacy/                    # Earlier experiments
│
├── results/                       # 📊 Model outputs & analysis
├── scripts/                       # 🔧 Utility scripts
├── src/                           # 📚 Core utilities
├── dataset/                       # 💾 Training/test data
└── image_cache/                   # 🖼️ Downloaded images (2,183 files)
```
| Model | SMAPE | Status | Features |
|---|---|---|---|
| Meta-Learning | 50.12% | ✅ Best | Advanced stacking + comprehensive features |
| Neural Enhanced | 50.45% | 🥈 Second | Deep learning + feature interactions |
| Computer Vision | ~50% | 🧪 Experimental | Image features + text analysis |
| Ensemble Models | 70-80% | ❌ Needs work | Various ensemble attempts |
🎯 Target: <48% SMAPE | 📈 Best Achievement: 50.12% SMAPE
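
For reference, SMAPE (symmetric mean absolute percentage error) is commonly defined as the mean of |y - ŷ| / ((|y| + |ŷ|) / 2), expressed as a percentage, so it ranges from 0% to 200% and lower is better. The sketch below assumes that definition. The function name `smape` and the zero-handling rule are illustrative assumptions, and the project's own scoring code may differ.

```python
import numpy as np


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    diff = np.abs(y_true - y_pred)
    # Assumption: a pair where both values are 0 counts as zero error.
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom > 0)
    return 100.0 * ratio.mean()
```

For example, `smape([100], [50])` evaluates to about 66.67, since |100 - 50| / ((100 + 50) / 2) = 50 / 75.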
- 🧠 Meta-Learning: Advanced stacking with 7 base models
- 🖼️ Computer Vision: Real image feature extraction (2,183+ images)
- 📝 NLP: Comprehensive text feature engineering
- ⚡ Ensemble Methods: Multiple ensemble approaches tested
- 📊 Robust Validation: Cross-validation with proper SMAPE optimization
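
The stacking idea behind the meta-learning model can be sketched as follows: each base model produces out-of-fold predictions, and a meta-model is trained on those predictions. This is a minimal sketch using scikit-learn on synthetic data, not the code in `meta_learning_model.py`. The helper names `fit_stack` and `predict_stack`, the choice of three base models (the project uses seven), and the Ridge meta-model are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict


def fit_stack(X, y, base_models, meta_model, n_splits=5, seed=0):
    """Fit base models and a meta-model on their out-of-fold predictions."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Each column holds one base model's out-of-fold predictions, so the
    # meta-model never sees a prediction made on a row the base model trained on.
    oof = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in base_models])
    meta_model.fit(oof, y)
    # Refit every base model on all rows for use at prediction time.
    fitted = [m.fit(X, y) for m in base_models]
    return fitted, meta_model


def predict_stack(fitted, meta_model, X):
    """Feed the base models' predictions to the meta-model."""
    base_preds = np.column_stack([m.predict(X) for m in fitted])
    return meta_model.predict(base_preds)


# Synthetic data for illustration only (not the project's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X[:, 0] * 2.0 + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=300)

base_models = [
    Ridge(alpha=1.0),
    RandomForestRegressor(n_estimators=50, random_state=0),
    GradientBoostingRegressor(random_state=0),
]
fitted, meta = fit_stack(X[:240], y[:240], base_models, Ridge(alpha=1.0))
preds = predict_stack(fitted, meta, X[240:])
```

Training the meta-model on out-of-fold predictions rather than in-sample predictions keeps it from learning to trust base models that have simply memorized the training rows.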
- Python 3.8+
- pandas, numpy, scikit-learn
- lightgbm, xgboost
- PIL (for image processing)
- See `requirements.txt` for the full list
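
To run the best model (Meta-Learning) from the repository root:

```bash
cd models
python meta_learning_model.py
```

To run the computer vision model from the repository root:

```bash
cd models
python computer_vision_model.py
```

To download the computer vision images from the repository root:

```bash
cd scripts
python download_computer_vision_images.py
```

- Training Data: 75,000 samples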
- Test Data: 75,000 samples
- Image Dataset: 140,587 unique URLs (2,183+ downloaded)
- Feature Engineering: 600+ features per model
- Cross-Validation: 5-fold stratified
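
Stratified splitting is defined for class labels, so a continuous target such as price has to be discretized first. One common approach, assumed here and not confirmed by the source, is to bin the log-price into quantiles and stratify on the bin index. The helper name `stratified_price_folds`, the use of `log1p`, and the bin count of 10 are illustrative choices. The prices below are synthetic.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def stratified_price_folds(prices, n_splits=5, n_bins=10, seed=0):
    """Yield (train_idx, valid_idx) pairs stratified on quantile bins of log-price."""
    log_price = np.log1p(np.asarray(prices, dtype=float))
    # Interior quantile edges; np.digitize maps each value to a bin 0..n_bins-1.
    edges = np.quantile(log_price, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(log_price, edges)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # StratifiedKFold only needs the number of rows from X, so a placeholder works.
    yield from skf.split(np.zeros(len(bins)), bins)


# Synthetic, roughly log-normal prices for illustration only.
rng = np.random.default_rng(0)
prices = rng.lognormal(mean=3.0, sigma=1.0, size=1000)
folds = list(stratified_price_folds(prices))
```

Each validation fold then contains a similar mix of cheap and expensive items, which tends to make per-fold SMAPE estimates more comparable than a purely random split.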

See the `results/` folder for detailed performance analysis.

The `experiments/` folder contains the research and development work:
- Ensemble Methods: Gradient boosting combinations
- Optimization: Advanced hyperparameter tuning
- Legacy Models: Early development iterations

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.
- Advanced machine learning techniques
- Meta-learning and stacking approaches
- Computer vision for e-commerce
- Ensemble method research
⭐ Star this repo if it helped you!