
TaraChat - RAG Chatbot with CroissantLLM

A full-stack chatbot application implementing Retrieval-Augmented Generation (RAG) using CroissantLLM, a French-optimized language model.

Features

Quick Start

# Start the application
docker-compose up --build

# Wait for "RAG system initialized successfully" in logs
docker-compose logs -f backend

Access the application:

The first startup takes 5-15 minutes while the CroissantLLM model (~3GB) is downloaded; subsequent starts reuse the cached model.

Documentation

Prerequisites

Technology Stack

Backend

Frontend

Project Structure

tarachat/
├── backend/           # FastAPI backend
│   ├── app/          # Application code
│   ├── scripts/      # Utility scripts
│   └── data/         # Sample documents
│
├── frontend/         # React frontend
│   └── src/         # Source code
│       └── components/  # React components
│
└── docker-compose.yml  # Orchestration

Key Commands

# Start application
docker-compose up -d

# View logs
docker-compose logs -f

# Stop application
docker-compose down

# Clean everything (including data)
docker-compose down -v

Or use the Makefile:

make up           # Start
make logs         # View logs
make down         # Stop
make clean        # Clean all
make ingest-docs  # Ingest documents from data/documents/
make list-docs    # List all documents in vector store

Example Usage

Chat with the Bot

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Parle-moi de Paris"}'

Upload a Document

curl -X POST http://localhost:8000/documents \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Your document text here",
    "metadata": {"title": "Example"}
  }'
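The same two endpoints can also be called from Python. Below is a minimal stdlib-only sketch; the exact shape of the JSON responses (e.g. a field listing cited sources) is an assumption, not a confirmed API schema:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"

def _post(path: str, payload: dict) -> dict:
    """POST a JSON payload to the backend and decode the JSON response."""
    req = request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

def chat(message: str) -> dict:
    """Ask the bot a question (mirrors the /chat curl example above)."""
    return _post("/chat", {"message": message})

def upload_document(content: str, metadata: dict) -> dict:
    """Add a document to the vector store (mirrors the /documents example)."""
    return _post("/documents", {"content": content, "metadata": metadata})
```

With the stack running, `chat("Parle-moi de Paris")` should return the generated answer as a dict.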

Development

Backend

cd backend
poetry install
poetry run uvicorn app.main:app --reload

Frontend

cd frontend
npm install
npm run dev

How RAG Works

  1. Documents are chunked and converted to embeddings
  2. User queries retrieve relevant document chunks
  3. CroissantLLM generates responses using retrieved context
  4. Sources are cited with each response
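The four steps above can be sketched end-to-end in a few lines. This is an illustrative toy, not the backend's implementation: the character-frequency "embedding" stands in for a real sentence-embedding model, and prompt construction stands in for the CroissantLLM generation call.

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: a 26-dim letter-frequency vector (placeholder for
    # the real embedding model used by the backend).
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text: str, size: int = 40) -> list[str]:
    # Step 1: split documents into fixed-size chunks.
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Step 2: rank chunks by embedding similarity to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Steps 3-4: the LLM answers from the retrieved context, and the
    # retrieved chunks double as citable sources.
    joined = "\n".join(f"- {c}" for c in context)
    return f"Contexte:\n{joined}\n\nQuestion: {query}\nRéponse:"

docs = chunk(
    "Paris est la capitale de la France. "
    "La Loire est le plus long fleuve de France."
)
context = retrieve("Parle-moi de Paris", docs)
prompt = build_prompt("Parle-moi de Paris", context)
```

In the real system, `prompt` would be sent to CroissantLLM and `context` returned alongside the answer as its sources.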

Troubleshooting

Model takes too long to load: the initial CroissantLLM download (~3GB) can take up to 15 minutes; later starts reuse the cached model

Out of memory: ensure at least 8GB of RAM is available and close other memory-heavy applications

Connection refused: wait for both containers to finish starting, then check the logs for errors

Frontend can't connect: verify both services are running with docker-compose ps
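Before digging into logs, it can help to confirm the backend port is actually accepting connections. A small stdlib check against port 8000 (the port used in the examples above):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: port_open("localhost", 8000) should be True once the
# backend container has finished starting.
```

If this returns False after startup completes, the backend container likely failed; check `docker-compose logs backend`.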

Resources

License

This project is provided as-is for educational and demonstration purposes.