Case Study — Fine-tuned LLM

Legal AI Assistant: Pakistan Penal Code Q&A

A domain-specific legal AI: Llama 3.2 8B fine-tuned with Unsloth LoRA on all 511 sections of the Pakistan Penal Code, served as a REST API on Hugging Face Spaces, and consumed by a React chat UI that renders structured Law Reference and Punishment cards per query.

Tags: Fine-tuned LLM, Legal AI


System Pipeline

Legal Query (input) → React Chat UI (step 1) → HF Spaces API (step 2) → Llama 3.2 8B (step 3) → Response Parser (step 4) → Law + Punishment (output)

511: PPC Sections Trained On
8B: Model Parameters
LoRA: Fine-Tuning Method
HF Spaces: Deployment

Overview

Making Legal Knowledge Accessible with AI

Understanding Pakistani law requires navigating 511 sections of the Pakistan Penal Code — dense, archaic language most citizens cannot practically interpret. Consulting a lawyer for everyday legal questions is expensive and inaccessible. A general-purpose LLM gives vague or hallucinated answers on specific PPC sections. The solution was a domain-specific fine-tuned model that knows the law precisely, served through a clean chat interface.

The Problem

Legal Knowledge Gap

511 PPC sections in Victorian-era legal language — inaccessible to most citizens
General LLMs hallucinate section numbers or confuse Pakistani and Indian Penal Codes
Legal consultation is expensive and out of reach for most Pakistanis
No conversational interface existed to query PPC sections in plain language

The Solution

Fine-Tuned Model + Structured Chat UI

Llama 3.2 8B was fine-tuned with Unsloth LoRA directly on the complete PPC — teaching it to always return a structured response with a clear Law Reference and Punishment. The model is deployed as a REST API on Hugging Face Spaces; a React chat frontend parses each response and renders two distinct cards: a blue Law Reference card and a red Punishment card.

System Architecture

Two-Part System: Fine-Tuned API + React Frontend

The system is split into two independently deployed parts: a fine-tuned model served on Hugging Face Spaces as a REST endpoint, and a React chat application on Vercel that calls that endpoint and renders structured legal responses.

Chat UI: React + TypeScript SPA built with Vite, using the shadcn/ui component library and Tailwind CSS for styling. Deployed on Vercel. TanStack React Query manages async API state; React Router handles navigation. Tags: React, TypeScript, Vite, shadcn/ui, Tailwind CSS, Vercel

API Request: The user message is sent as a POST to the HF Spaces endpoint with the JSON body {"messages": [{"role": "user", "content": "..."}]}. Response fields are checked in order: data.reply → data.response → raw string → fallback error message. Tags: REST API, JSON, React Query, Fetch

Model API: Fine-tuned Llama 3.2 8B served as a FastAPI endpoint on Hugging Face Spaces (mohsin65-ppc-legal-assistant-api.hf.space/chat). It handles conversational legal queries using the Pakistan Penal Code knowledge baked into its weights via LoRA fine-tuning. Tags: Hugging Face Spaces, FastAPI, Llama 3.2 8B, Unsloth LoRA

Fine-Tuning: Llama 3.2 8B fine-tuned using Unsloth QLoRA (4-bit quantisation, rank-16 LoRA adapters on attention projections) on a structured instruction dataset built from all 511 PPC sections. Trained to output a consistent Law Reference + Punishment format. Tags: Unsloth, LoRA, QLoRA, 4-bit Quantisation, GGUF

Response Rendering: The MessageBubble component runs parseAIResponse(), which uses regex pattern matching to extract the "Law Reference" and "Punishment" boundaries from model output. Matched responses render as two styled cards: a book icon and blue border for Law Reference, a gavel icon and red accent for Punishment. Tags: Regex Parsing, Conditional Rendering, shadcn/ui Cards
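
The request and response handling described under API Request can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names buildChatBody, extractReply, sendChat, FetchLike, and FALLBACK_MESSAGE are hypothetical, and sending prior turns along with the new message is an assumption drawn from the multi-turn description. Only the JSON body shape and the field-check order come from the text above.

```typescript
// Sketch only. Body shape and field order follow the description above;
// all function and type names here are hypothetical.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Minimal fetch-shaped type so the function can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Assumed wording; the source only says a fallback error message exists.
const FALLBACK_MESSAGE = "Sorry, no response was received. Please try again.";

// Builds {"messages": [...]} with earlier turns first (assumption) and the
// new user message last.
function buildChatBody(history: ChatMessage[], userText: string): string {
  const messages: ChatMessage[] = [...history, { role: "user", content: userText }];
  return JSON.stringify({ messages });
}

// Checks fields in the documented order:
// data.reply -> data.response -> raw string -> fallback error message.
function extractReply(data: unknown): string {
  if (data !== null && typeof data === "object") {
    const record = data as Record<string, unknown>;
    if (typeof record.reply === "string" && record.reply.trim() !== "") {
      return record.reply;
    }
    if (typeof record.response === "string" && record.response.trim() !== "") {
      return record.response;
    }
  }
  if (typeof data === "string" && data.trim() !== "") {
    return data;
  }
  return FALLBACK_MESSAGE;
}

// POSTs the chat body and normalises whatever comes back into one string.
async function sendChat(
  fetchFn: FetchLike,
  endpoint: string,
  history: ChatMessage[],
  userText: string
): Promise<string> {
  const res = await fetchFn(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildChatBody(history, userText),
  });
  if (!res.ok) {
    return FALLBACK_MESSAGE;
  }
  const text = await res.text();
  try {
    return extractReply(JSON.parse(text));
  } catch {
    // Body was not JSON: treat it as the raw string case.
    return extractReply(text);
  }
}
```

In the browser, the global fetch could be passed as fetchFn, with the /chat endpoint named above as the URL. Injecting the fetch function is a design choice of this sketch that makes the fallback chain easy to unit-test.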

Key Features

Precision Legal Q&A with Structured Output

Full PPC Coverage — All 511 sections of the Pakistan Penal Code embedded in the model's fine-tuned weights — from Section 1 (Title and Extent) through Section 511 (Attempt to Commit Offences).

Structured Law Reference Card — Every AI response that matches a PPC section renders a dedicated blue-bordered card with a book icon showing the extracted Law Reference — separated from the punishment text.

Structured Punishment Card — The punishment portion of each response is rendered in a distinct red-accented card with a gavel icon — making legal consequences immediately scannable.

Unsloth LoRA Fine-Tuning — 4-bit QLoRA via Unsloth reduces VRAM requirement by ~70% vs full fine-tuning; rank-16 LoRA adapters on q/k/v/o projections achieve high domain accuracy with minimal compute.

Hugging Face Spaces Deployment — Model served as a REST API on HF Spaces — no GPU infrastructure to manage; accessible over HTTPS from any frontend without authentication overhead.

Robust Response Parsing — Frontend parseAIResponse() uses two regex patterns in sequence: primary pattern matches full "Section … Punishment:" block; fallback locates "Section [number]" — plain text returned when neither matches.

Multi-turn Conversation — Chat state maintained in React with full message history; each subsequent query can reference prior context for follow-up legal questions.

Auto-scroll & Loading States — useRef + useEffect auto-scroll to newest messages; loading indicator displayed during API calls; toast notifications on errors — all built with shadcn/ui.
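
To give a sense of scale for the rank-16 adapters on the q/k/v/o projections mentioned in the Unsloth LoRA feature, the arithmetic below counts the weights LoRA would add. Each adapted projection with input size d_in and output size d_out gains rank × (d_in + d_out) trainable weights. The model dimensions used here (hidden size 4096, 32 layers, key/value projection width 1024) are assumptions typical of Meta's 8B Llama architecture and are not stated in this case study; the function and variable names are illustrative.

```typescript
// Sketch only: model dimensions are assumed, not taken from the case study.
type Projection = { name: string; inDim: number; outDim: number };

// LoRA adds two low-rank matrices per projection: (inDim x rank) and (rank x outDim).
function loraParamsPerLayer(rank: number, projections: Projection[]): number {
  return projections.reduce((sum, p) => sum + rank * (p.inDim + p.outDim), 0);
}

const rank = 16;     // from the text: rank-16 adapters
const hidden = 4096; // assumed hidden size
const kvDim = 1024;  // assumed: 8 key/value heads x 128 dims each
const layers = 32;   // assumed number of transformer layers

const projections: Projection[] = [
  { name: "q_proj", inDim: hidden, outDim: hidden },
  { name: "k_proj", inDim: hidden, outDim: kvDim },
  { name: "v_proj", inDim: hidden, outDim: kvDim },
  { name: "o_proj", inDim: hidden, outDim: hidden },
];

const trainableParams = layers * loraParamsPerLayer(rank, projections);
console.log(trainableParams); // 13631488
```

Under these assumptions the adapters come to roughly 13.6 million trainable weights, well under 1% of an 8-billion-parameter base model, which is consistent with the low-compute framing above.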

Technical Highlights

Key Engineering Decisions

Why Fine-Tune vs RAG

Weights vs Retrieval for Legal Accuracy

RAG was evaluated but rejected: PPC section text is highly cross-referenced, and retrieval frequently fetches the wrong section when queries use colloquial language ("what happens if I steal" vs "Section 379"). Fine-tuning bakes the plain-language → section mapping directly into Llama's weights via LoRA, eliminating the retrieval error surface at the cost of a fixed training corpus. The structured output format is enforced at training time, not at prompt time.

Frontend Response Parsing

Regex Pattern Matching on Model Output

The React parseAIResponse() function uses two sequential regex patterns to extract structure from the model's free-text output. The primary pattern matches the full "Section … through … Punishment:" block. If that fails, a fallback pattern locates "Section [number]" alone. This two-tier approach handles both well-formatted and partial model outputs without crashing the UI; unstructured responses fall through to plain text display.
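
A two-tier parser of this kind might look roughly like the sketch below. The function name parseAIResponse comes from the text; the regular expressions, the ParsedResponse type, and the sample strings are assumptions made for illustration, since the actual patterns are not shown in this case study.

```typescript
// Sketch only: the real patterns are not published; these are assumed stand-ins.
type ParsedResponse =
  | { kind: "structured"; lawReference: string; punishment: string }
  | { kind: "reference-only"; lawReference: string }
  | { kind: "plain"; text: string };

// Primary pattern: "Section <number> ... Punishment: <rest of text>".
const PRIMARY_PATTERN = /(Section\s+\d+(?:-?[A-Z])?[\s\S]*?)Punishment\s*:\s*([\s\S]+)/i;

// Fallback pattern: a bare "Section <number>" mention.
const FALLBACK_PATTERN = /Section\s+\d+(?:-?[A-Z])?/i;

function parseAIResponse(raw: string): ParsedResponse {
  const text = raw.trim();

  // Tier 1: full block with both a section reference and a punishment.
  const full = PRIMARY_PATTERN.exec(text);
  if (full) {
    return {
      kind: "structured",
      lawReference: full[1].trim(),
      punishment: full[2].trim(),
    };
  }

  // Tier 2: only a section reference could be located.
  const reference = FALLBACK_PATTERN.exec(text);
  if (reference) {
    return { kind: "reference-only", lawReference: reference[0] };
  }

  // Neither pattern matched: the UI shows the text as-is.
  return { kind: "plain", text };
}

// Illustrative input, written in the format the model is described as producing.
const sample = parseAIResponse(
  "Law Reference: Section 379 of the Pakistan Penal Code. Punishment: Imprisonment, fine, or both."
);
```

Returning a tagged union lets the rendering component switch on kind to choose between the two cards, a single reference card, or a plain bubble, so a malformed response never reaches the card layout.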

Technology Stack

Full Stack: Fine-Tuning to Frontend

React, TypeScript, Vite, Tailwind CSS, shadcn/ui, TanStack React Query, React Router, Llama 3.2 8B, Unsloth, LoRA / QLoRA, Hugging Face Spaces, FastAPI, GGUF, Vercel

Built By

Development Team

Mohsin Sabir (Developer)
Fatima Abbas (Developer)

Need a domain-specific AI model fine-tuned for your field?

We fine-tune LLMs on proprietary datasets — legal, medical, financial, technical — and build the frontend to make them usable. From training to deployment.

