# AI-Powered E-commerce Analytics

An end-to-end AI-powered analytics platform for e-commerce data processing, featuring automated sentiment analysis, KPI generation, and multi-language client support.
## Overview

This project is a comprehensive data engineering solution designed to:
- Extract e-commerce data from various sources (Supabase storage, APIs)
- Transform raw data using AI-powered sentiment analysis and automated KPI calculations
- Load processed data into databases and storage systems for analytics
- Provide multi-language client interfaces (Python, Go) for data interaction
- Generate actionable insights from customer reviews, sales data, and user behavior
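The bullets above boil down to a classic extract, transform, load loop. Here is a minimal sketch of that flow; the function names and sample record are hypothetical, not the project's actual API:

```python
# Minimal sketch of the extract -> transform -> load flow described above.
# The real pipeline lives in etl_pipeline/src/etl_pipeline/.

def extract() -> list[dict]:
    """Pull raw e-commerce records, e.g. from Supabase storage or an API."""
    return [{"review": "Great product!", "price": 19.99, "shop_id": "s1"}]

def transform(records: list[dict]) -> list[dict]:
    """Enrich records, e.g. attach an LLM-based sentiment label."""
    return [{**r, "sentiment": "positive"} for r in records]

def load(records: list[dict]) -> None:
    """Persist enriched records for analytics."""
    for record in records:
        print("loading", record)

if __name__ == "__main__":
    load(transform(extract()))
```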
## Key Features

- **AI-Powered Sentiment Analysis**: Uses a local LLM (llama.cpp) for review classification
- **Automated KPI Generation**: Shop performance, user metrics, and time-series analytics
- **Containerized Architecture**: Docker-based microservices for scalability
- **Modular ETL Pipeline**: Clean separation of extract, transform, and load operations
- **Multi-Client Support**: Python and Go clients
- **Real-time Processing**: Streaming and batch processing capabilities
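Because llama.cpp exposes an OpenAI-compatible API (see the `base_url` in the configuration example later in this README), the sentiment classification can be reached with the standard `openai` client. A hedged sketch, reusing the model name and base URL from `config.yaml`; the prompt wording is illustrative, not the project's actual prompt:

```python
# Classify a review's sentiment via the local llama.cpp server's
# OpenAI-compatible endpoint. The API key is unused by the local server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1/", api_key="not-needed-locally")

def classify_review(review: str) -> str:
    response = client.chat.completions.create(
        model="gemma-3-1b",
        messages=[
            {"role": "system", "content": "Reply with exactly one word: positive or negative."},
            {"role": "user", "content": review},
        ],
        temperature=0.0,
    )
    return response.choices[0].message.content.strip().lower()

print(classify_review("Fast shipping and great quality!"))  # -> "positive"
```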
## Project Structure

```
AI-Powered-E-commerce-Analytics/
├── README.md                    # This file - Project overview and setup
├── .env.example                 # Environment variables template
├── config.yaml                  # Global configuration file
├── docker-compose.yml           # Multi-service orchestration
├── .gitignore                   # Git ignore patterns
│
├── etl_pipeline/                # Main ETL Pipeline (Python)
│   ├── src/
│   │   └── etl_pipeline/
│   │       ├── main.py          # Pipeline orchestrator
│   │       ├── models/          # Pydantic data models
│   │       ├── extract/         # Data extraction module
│   │       ├── transform/       # AI transformation module
│   │       ├── load/            # Data loading module
│   │       └── utils/           # Common utilities
│   ├── Dockerfile               # ETL container config
│   ├── requirements.txt         # Python dependencies
│   └── README.md                # ETL-specific documentation
│
├── Clients/                     # Client Applications
│   ├── go/                      # Go Client
│   │   ├── cmd/main.go          # Go application entry point
│   │   ├── internal/            # Internal Go packages
│   │   │   ├── models/          # Data structures
│   │   │   ├── extract/         # Data extraction
│   │   │   ├── enrichment/      # AI enrichment
│   │   │   └── load/            # Data loading
│   │   ├── pkg/utils/           # Shared utilities
│   │   ├── Dockerfile           # Go client container
│   │   └── README.md            # Go client docs
│   │
│   └── python/                  # Python Clients
│       ├── llama_cpp_client.py  # Direct llama.cpp integration
│       └── ollama_client.py     # Ollama client wrapper
│
├── collect/                     # Data Collection Service
│   ├── collector.py             # Data collection logic
│   ├── Dockerfile               # Collector container
│   └── requirements.txt         # Collector dependencies
│
├── llama.cpp/                   # AI/LLM Infrastructure (Git Submodule)
│   ├── models/                  # LLM model files (.gguf)
│   └── .devops/                 # Docker configs for llama.cpp
│
├── logs/                        # Application Logs
│   └── http_requests.log        # HTTP request logs
│
└── modelfiles/                  # Model Configuration Files
    └── Modelfile                # LLM model definitions
```
## Architecture Components

### ETL Pipeline (`etl_pipeline/`)

- Purpose: Core data processing engine
- Technology: Python with Polars, Pydantic, and the OpenAI API
- Features: Modular architecture, async processing, AI integration
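As a flavor of the Polars + Pydantic combination, here is a hedged sketch of validating enriched records and aggregating shop-level KPIs. The model and field names are illustrative, not the project's actual schema:

```python
# Validate enriched records with Pydantic, then aggregate KPIs with Polars.
import polars as pl
from pydantic import BaseModel

class EnrichedReview(BaseModel):
    shop_id: str
    profit: float
    sentiment: str  # "positive" or "negative"

rows = [
    EnrichedReview(shop_id="s1", profit=12.5, sentiment="positive"),
    EnrichedReview(shop_id="s1", profit=8.0, sentiment="negative"),
]

df = pl.DataFrame([r.model_dump() for r in rows])
kpis = df.group_by("shop_id").agg(
    pl.col("profit").mean().alias("average_profit"),
    (pl.col("sentiment") == "positive").sum().alias("positive_reviews"),
    (pl.col("sentiment") == "negative").sum().alias("negative_reviews"),
)
print(kpis)
```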
### Clients (`Clients/`)

- Go Client: High-performance concurrent processing
- Python Clients: Flexible scripting and prototyping
### Data Collector (`collect/`)

- Purpose: Web scraping and data ingestion
- Features: Rate limiting, error handling, data validation
### LLM Server (`llama.cpp/`)

- Purpose: Local LLM server for sentiment analysis
- Technology: llama.cpp with CUDA acceleration
- Models: Gemma, OpenHermes, and other GGUF models
## Storage Layout

The storage bucket follows the medallion architecture:

```
├── Bronze/          # Raw data
│   ├── old/         # Raw data that has already been enriched
│   └── new/         # Raw data waiting to be enriched
├── Silver/          # Enriched data
│   ├── processed/   # Enriched data already used to generate KPIs
│   └── to_process/  # Enriched data waiting for KPI generation
└── gold/            # Final data
```
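To illustrate the Bronze promotion step (`new/` to `old/` after enrichment), a hedged sketch using supabase-py's storage helpers. The bucket name and environment variable names mirror the examples later in this README, and the actual pipeline code may differ:

```python
# Promote enriched Bronze objects from Bronze/new to Bronze/old
# using supabase-py's storage API.
import os
from supabase import create_client

supabase = create_client(os.environ["project_url"], os.environ["project_key"])
bucket = supabase.storage.from_("your-bucket-name")

for obj in bucket.list("Bronze/new"):
    name = obj["name"]
    # ... enrich the file contents here ...
    bucket.move(f"Bronze/new/{name}", f"Bronze/old/{name}")
```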
## Database Schema

Create three tables in Supabase that follow these schemas:

```python
# User KPIs
{'id': String, 'average_spent': Float64, 'positive_reviews': UInt32, 'negative_reviews': UInt32, 'likeness_score': Float64, 'normalized_likeness_score': Float64}

# Shop KPIs
{'shop_id': String, 'average_profit': Float64, 'positive_reviews': UInt32, 'negative_reviews': UInt32, 'likeness_score': Float64, 'normalized_likeness_score': Float64}

# Daily time series
{'date': String, 'average_profit_per_day': Float64}
```
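Once the tables exist, rows can be written with supabase-py. A hedged sketch for the shop KPI table; the table name `shop_kpis` is an assumption, so match it to whatever you actually created in Supabase:

```python
# Upsert one shop KPI row into Supabase. Field names follow the schema above;
# the table name "shop_kpis" is a hypothetical example.
import os
from supabase import create_client

supabase = create_client(os.environ["project_url"], os.environ["project_key"])

supabase.table("shop_kpis").upsert({
    "shop_id": "s1",
    "average_profit": 10.25,
    "positive_reviews": 1,
    "negative_reviews": 1,
    "likeness_score": 0.5,
    "normalized_likeness_score": 0.5,
}).execute()
```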
## Quick Start

### Clone the Repository

```bash
git clone --recursive https://github.com/aymen-fkir/AI-Powered-E-commerce-Analytics.git
cd AI-Powered-E-commerce-Analytics
```

### Configure the Environment

```bash
# Copy environment template
cp .env.example .env

# Edit with your credentials
nano .env
```

Required environment variables:

```bash
project_url=https://your-project.supabase.co
project_key=your_supabase_service_key
project_jwt_key=your-supabase-jwt-key
api_key=marcove-api-key
```

Edit `config.yaml` to match your setup:
```yaml
# Supabase storage paths
supabase:
  bucketName: 'your-bucket-name'

# AI model configuration
ETLCONFIG:
  model: 'gemma-3-1b'
  base_url: 'http://localhost:8000/v1/'
```

### Run with Docker

```bash
# Build and start all containers
docker-compose up -d

# View logs
docker-compose logs -f
```

Individual services can also be started on their own:

```bash
# LLM server only
docker-compose up llama

# ETL pipeline only
docker-compose up etl

# Go client only
docker-compose up go-client

# Data collector only
docker-compose up collector
```

### Service Endpoints

- LLM Server: http://localhost:8000
- Go Client: http://localhost:8080
- ETL Pipeline: http://localhost:5000
- Data Collector: http://localhost:5050
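A quick way to verify that all four services came up is to probe each port. A hedged sketch using `requests`; the root paths are illustrative and may not match each service's real routes:

```python
# Smoke-test the service endpoints listed above.
import requests

services = {
    "LLM Server": "http://localhost:8000",
    "Go Client": "http://localhost:8080",
    "ETL Pipeline": "http://localhost:5000",
    "Data Collector": "http://localhost:5050",
}

for name, url in services.items():
    try:
        status = requests.get(url, timeout=3).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```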
## Extending the Project

### Adding a New Transformation

1. Create a new module in `etl_pipeline/src/etl_pipeline/transform/`
2. Implement the transformation logic
3. Register it in the main pipeline orchestrator (see the sketch below)
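The orchestrator's registration API is not shown in this README, so as a rough illustration, a new transform module might expose a single `transform` method. The `Transformer` protocol below is an assumption, not the project's actual interface:

```python
# Illustrative shape for a new transform module.
from typing import Protocol

class Transformer(Protocol):
    def transform(self, records: list[dict]) -> list[dict]: ...

class PriceBucketTransformer:
    """Example transform: tag each record with a price bucket."""

    def transform(self, records: list[dict]) -> list[dict]:
        return [
            {**r, "price_bucket": "high" if r.get("price", 0) > 50 else "low"}
            for r in records
        ]
```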
### Adding a New Client

1. Create a new client directory in `Clients/`
2. Implement the core interfaces (extract, transform, load)
3. Add a Docker configuration
### Adding a New Model

1. Add the model files to `llama.cpp/models/`
2. Update `config.yaml` with the model configuration
3. Restart the llama service
Note: This project demonstrates modern data engineering practices with AI integration. Perfect for learning microservices architecture, containerization, and ML operations. You can read more about my project in this blog.