AI-Powered E-commerce Analytics 🛍️🤖

An end-to-end AI-powered analytics platform for e-commerce data processing, featuring automated sentiment analysis, KPI generation, and multi-language client support.

🎯 Project Goal & What It Does

This project is a comprehensive data engineering solution designed to:

  • Extract e-commerce data from various sources (Supabase storage, APIs)
  • Transform raw data using AI-powered sentiment analysis and automated KPI calculations
  • Load processed data into databases and storage systems for analytics
  • Provide multi-language client interfaces (Python, Go) for data interaction
  • Generate actionable insights from customer reviews, sales data, and user behavior

Key Features:

  • 🤖 AI-Powered Sentiment Analysis: Uses a local LLM (llama.cpp) for review classification
  • 📊 Automated KPI Generation: Shop performance, user metrics, and time-series analytics
  • 🐳 Containerized Architecture: Docker-based microservices for scalability
  • 🔄 Modular ETL Pipeline: Clean separation of extract, transform, and load operations
  • 🌐 Multi-Client Support: Python and Go clients
  • 📈 Real-time Processing: Streaming and batch processing capabilities

πŸ“ Repository Structure

AI-Powered-E-commerce-Analytics/
β”œβ”€β”€ README.md                           # This file - Project overview and setup
β”œβ”€β”€ .env.example                        # Environment variables template
β”œβ”€β”€ config.yaml                         # Global configuration file
β”œβ”€β”€ docker-compose.yml                  # Multi-service orchestration
β”œβ”€β”€ .gitignore                          # Git ignore patterns
β”œβ”€β”€ 
β”œβ”€β”€ etl_pipeline/                       # πŸ”„ Main ETL Pipeline (Python)
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   └── etl_pipeline/
β”‚   β”‚       β”œβ”€β”€ main.py                 # Pipeline orchestrator
β”‚   β”‚       β”œβ”€β”€ models/                 # Pydantic data models
β”‚   β”‚       β”œβ”€β”€ extract/                # Data extraction module
β”‚   β”‚       β”œβ”€β”€ transform/              # AI transformation module
β”‚   β”‚       β”œβ”€β”€ load/                   # Data loading module
β”‚   β”‚       └── utils/                  # Common utilities
β”‚   β”œβ”€β”€ Dockerfile                      # ETL container config
β”‚   β”œβ”€β”€ requirements.txt                # Python dependencies
β”‚   └── README.md                       # ETL-specific documentation
β”‚
β”œβ”€β”€ Clients/                            # πŸ‘₯ Client Applications
β”‚   β”œβ”€β”€ go/                             # πŸ”· Go Client
β”‚   β”‚   β”œβ”€β”€ cmd/main.go                 # Go application entry point
β”‚   β”‚   β”œβ”€β”€ internal/                   # Internal Go packages
β”‚   β”‚   β”‚   β”œβ”€β”€ models/                 # Data structures
β”‚   β”‚   β”‚   β”œβ”€β”€ extract/                # Data extraction
β”‚   β”‚   β”‚   β”œβ”€β”€ enrichment/             # AI enrichment
β”‚   β”‚   β”‚   └── load/                   # Data loading
β”‚   β”‚   β”œβ”€β”€ pkg/utils/                  # Shared utilities
β”‚   β”‚   β”œβ”€β”€ Dockerfile                  # Go client container
β”‚   β”‚   └── README.md                   # Go client docs
β”‚   β”‚
β”‚   └── python/                         # 🐍 Python Clients
β”‚       β”œβ”€β”€ llama_cpp_client.py         # Direct llama.cpp integration
β”‚       └── ollama_client.py            # Ollama client wrapper
β”‚
β”œβ”€β”€ collect/                            # πŸ“‘ Data Collection Service
β”‚   β”œβ”€β”€ collector.py                    # Data collection logic
β”‚   β”œβ”€β”€ Dockerfile                      # Collector container
β”‚   └── requirements.txt                # Collector dependencies
β”‚
β”œβ”€β”€ llama.cpp/                          # 🧠 AI/LLM Infrastructure (Git Submodule)
β”‚   β”œβ”€β”€ models/                         # LLM model files (.gguf)
β”‚   └── .devops/                        # Docker configs for llama.cpp
β”‚
β”œβ”€β”€ logs/                               # πŸ“‹ Application Logs
β”‚   └── http_requests.log               # HTTP request logs
β”‚
└── modelfiles/                         # πŸ“„ Model Configuration Files
    └── Modelfile                       # LLM model definitions

Component Overview:

🔄 ETL Pipeline (etl_pipeline/)

  • Purpose: Core data processing engine
  • Technology: Python with Polars, Pydantic, OpenAI API
  • Features: Modular architecture, async processing, AI integration
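For illustration, here is a minimal sketch of how review sentiment can be classified through the local llama.cpp server, using its OpenAI-compatible endpoint and the model/base_url values from config.yaml (the prompt and label set are assumptions, not the pipeline's exact implementation):

from openai import OpenAI

# llama.cpp serves an OpenAI-compatible API; values mirror config.yaml
client = OpenAI(base_url="http://localhost:8000/v1/", api_key="not-needed-locally")

def classify_review(review: str) -> str:
    """Return 'positive' or 'negative' for a single review text."""
    response = client.chat.completions.create(
        model="gemma-3-1b",
        messages=[
            {"role": "system", "content": "Classify the review sentiment. Answer with exactly one word: positive or negative."},
            {"role": "user", "content": review},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

print(classify_review("Fast shipping and great quality!"))  # expected: positive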

👥 Client Applications (Clients/)

  • Go Client: High-performance concurrent processing
  • Python Clients: Flexible scripting and prototyping

📡 Data Collector (collect/)

  • Purpose: Web scraping and data ingestion
  • Features: Rate limiting, error handling, data validation

🧠 AI Infrastructure (llama.cpp/)

  • Purpose: Local LLM server for sentiment analysis
  • Technology: llama.cpp with CUDA acceleration
  • Models: Gemma, OpenHermes, and other GGUF models

📋 Prerequisites

The storage bucket follows the medallion architecture, laid out as follows:

├── Bronze/                       # Raw data
│   ├── old/                      # Raw data that has already been enriched
│   └── new/                      # Raw data waiting to be enriched
├── Silver/                       # Enriched data
│   ├── processed/                # Enriched data already used to generate KPIs
│   └── to_process/               # Enriched data waiting for KPI generation
└── gold/                         # Final data
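As a sketch of how files might move through these folders with supabase-py (the bucket name is a placeholder, and the enrichment step is elided; the storage calls are from supabase-py's storage client):

from supabase import create_client

supabase = create_client("https://your-project.supabase.co", "your_supabase_service_key")
bucket = supabase.storage.from_("your-bucket-name")

# List raw files awaiting enrichment, then archive each one after processing
for obj in bucket.list("Bronze/new"):
    name = obj["name"]
    # ... enrich the file's contents, write results to Silver/to_process ...
    bucket.move(f"Bronze/new/{name}", f"Bronze/old/{name}")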

Create three tables in Supabase with the following schemas:

Table name: user_kpis

{'id': String, 'average_spent': Float64, 'positive_reviews': UInt32, 'negative_reviews': UInt32, 'likeness_score': Float64, 'normalized_likeness_score': Float64}

Table name: shop_kpis

{'shop_id': String, 'average_profit': Float64, 'positive_reviews': UInt32, 'negative_reviews': UInt32, 'likeness_score': Float64, 'normalized_likeness_score': Float64}

Table name: date_kpis

{'date': String, 'average_profit_per_day': Float64}
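As one worked example, the user-level KPIs above can be derived from enriched review data with Polars; the likeness_score and normalization formulas below are illustrative assumptions, not necessarily the pipeline's exact definitions:

import polars as pl

# Toy enriched data: one row per review
reviews = pl.DataFrame({
    "id": ["u1", "u1", "u2", "u2"],
    "amount_spent": [20.0, 35.0, 10.0, 15.0],
    "sentiment": ["positive", "negative", "positive", "positive"],
})

user_kpis = (
    reviews.group_by("id")
    .agg(
        pl.col("amount_spent").mean().alias("average_spent"),
        (pl.col("sentiment") == "positive").sum().cast(pl.UInt32).alias("positive_reviews"),
        (pl.col("sentiment") == "negative").sum().cast(pl.UInt32).alias("negative_reviews"),
    )
    .with_columns(
        # Assumed definition: share of positive reviews among all reviews
        (pl.col("positive_reviews")
         / (pl.col("positive_reviews") + pl.col("negative_reviews"))).alias("likeness_score")
    )
    .with_columns(
        # Assumed min-max normalization across users
        ((pl.col("likeness_score") - pl.col("likeness_score").min())
         / (pl.col("likeness_score").max() - pl.col("likeness_score").min())).alias("normalized_likeness_score")
    )
)
print(user_kpis)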

🚀 Getting Started

1. Clone the Repository

git clone --recursive https://github.com/aymen-fkir/AI-Powered-E-commerce-Analytics.git
cd AI-Powered-E-commerce-Analytics

2. Environment Setup

# Copy environment template
cp .env.example .env

# Edit with your credentials
nano .env

Required environment variables:

project_url=https://your-project.supabase.co
project_key=your_supabase_service_key
project_jwt_key=your-supabase-jwt-key
api_key=marcove-api-key
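A minimal sketch of loading these variables and creating a Supabase client (assuming python-dotenv and supabase-py are installed):

import os
from dotenv import load_dotenv
from supabase import create_client

load_dotenv()  # reads key=value pairs from .env into the environment

supabase = create_client(os.environ["project_url"], os.environ["project_key"])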

3. Configuration

Edit config.yaml to match your setup:

# Supabase storage paths
supabase:
  bucketName: 'your-bucket-name'
  
# AI model configuration  
ETLCONFIG:
  model: 'gemma-3-1b'
  base_url: 'http://localhost:8000/v1/'
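These settings can be read in Python with PyYAML (a minimal sketch assuming the key names shown above):

import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

bucket_name = config["supabase"]["bucketName"]
model = config["ETLCONFIG"]["model"]
base_url = config["ETLCONFIG"]["base_url"]
print(bucket_name, model, base_url)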

🐳 Running with Docker (Recommended)

Start All Services

# Build and start all containers
docker-compose up -d

# View logs
docker-compose logs -f

Run Individual Services

# LLM server only
docker-compose up llama

# ETL pipeline only  
docker-compose up etl

# Go client only
docker-compose up go-client

# Data collector only
docker-compose up collector

Service Endpoints
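A quick way to verify the LLM service endpoint is up (a sketch assuming the base_url from config.yaml; llama.cpp's server implements the OpenAI-compatible /v1/models route):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1/", api_key="unused")
for model in client.models.list():
    print(model.id)  # should include the loaded GGUF model, e.g. gemma-3-1b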

πŸ› οΈ Manual Installation

ETL Pipeline

ETL

Go Client

CLIENT

Data Collector

COLLECTOR

llama-cpp

LLAMA-CPP

🔧 Development

Adding New Transformations

  1. Create new module in etl_pipeline/src/etl_pipeline/transform/
  2. Implement transformation logic
  3. Register in main pipeline orchestrator
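A skeletal example of such a module (the file name, column name, and transform(df) signature are hypothetical; match whatever interface the orchestrator in main.py expects):

# etl_pipeline/src/etl_pipeline/transform/review_length.py (hypothetical)
import polars as pl

def transform(df: pl.DataFrame) -> pl.DataFrame:
    """Example transformation: annotate each review with its word count."""
    return df.with_columns(
        pl.col("review").str.split(" ").list.len().alias("review_word_count")
    )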

Extending Client Support

  1. Create new client directory in Clients/
  2. Implement core interfaces (extract, transform, load)
  3. Add Docker configuration
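In Python, the core interfaces could be as small as this sketch (function names are illustrative, mirroring the extract/enrichment/load layout of the Go client):

import polars as pl

def extract() -> pl.DataFrame:
    """Pull raw records from storage (e.g. Bronze/new)."""
    raise NotImplementedError

def enrich(df: pl.DataFrame) -> pl.DataFrame:
    """Apply sentiment analysis or other AI enrichment."""
    raise NotImplementedError

def load(df: pl.DataFrame) -> None:
    """Write enriched data back (e.g. Silver/to_process)."""
    raise NotImplementedError

if __name__ == "__main__":
    load(enrich(extract()))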

Custom AI Models

  1. Add model files to llama.cpp/models/
  2. Update config.yaml with model configuration
  3. Restart llama service
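For example, after dropping a new GGUF file into llama.cpp/models/, the corresponding config.yaml change might look like this (the model name is illustrative; it must match whatever the llama service loads):

ETLCONFIG:
  model: 'openhermes-2.5'              # swap in the newly added model
  base_url: 'http://localhost:8000/v1/'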

Note: This project demonstrates modern data engineering practices with AI integration, and it is well suited for learning microservices architecture, containerization, and ML operations. You can read more about the project in the author's accompanying blog post.
