Legal AI Research Lab

Kaipsul

Kaipsul is an independent AI research lab. Our current project focuses on context engineering.

Important Notice: MIRC (Research Preview)

Memory-Isolated Recursive Compression ("MIRC") utilizes algorithmic segmentation and probabilistic compression. Output constitutes a lossy representation, not a verbatim reproduction. Users must independently verify all output against original source text. MIRC is provided strictly for research and evaluation purposes. Do not rely on compressed output for legal filings, evidentiary support, or factual verification.

MIRC

Memory-Isolated Recursive Compression for document pre-processing

What is MIRC?

MIRC is a document pre-processing tool that segments large texts into memory-isolated chunks, compresses them in parallel using on-device AI, and then reconstructs the compressed output.

The output is designed for downstream AI processing: it aims to preserve semantic structure while reducing token count.

The Process

  1. Chunking: Segment the document into predefined memory chunks

  2. Compression: Process each chunk independently using on-device AI

  3. Reconstruction: Concatenate processed segments into a unified file

  4. Downstream Integration: Output is formatted for LLM inference
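The four-step process above can be sketched in Swift. This is an illustrative outline, not the actual Kaipsul source: all names are hypothetical, and the "compression" step is a whitespace-collapsing stand-in for the real on-device model call.

```swift
// Hypothetical sketch of the MIRC pipeline; names are illustrative,
// not the actual Kaipsul API.

// 1. Chunking: split a document into fixed-size, independent chunks.
func makeChunks(_ text: String, size: Int) -> [String] {
    var chunks: [String] = []
    var start = text.startIndex
    while start < text.endIndex {
        let end = text.index(start, offsetBy: size,
                             limitedBy: text.endIndex) ?? text.endIndex
        chunks.append(String(text[start..<end]))
        start = end
    }
    return chunks
}

// 2. Compression: placeholder for the on-device model call.
// A real implementation would send each chunk to a language model.
func compressChunk(_ chunk: String) async -> String {
    chunk.split(whereSeparator: \.isWhitespace).joined(separator: " ")
}

// 2–3. Compress chunks in parallel, then reconstruct in order.
func mircCompress(_ document: String, chunkSize: Int = 4096) async -> String {
    let chunks = makeChunks(document, size: chunkSize)
    var results = [String?](repeating: nil, count: chunks.count)
    await withTaskGroup(of: (Int, String).self) { group in
        for (i, chunk) in chunks.enumerated() {
            group.addTask { (i, await compressChunk(chunk)) }
        }
        for await (i, compressed) in group {
            results[i] = compressed   // index preserves original chunk order
        }
    }
    // 3. Reconstruction: concatenate into a unified output.
    return results.compactMap { $0 }.joined(separator: "\n")
}
```

Because tasks can finish in any order, each result is written back by index so reconstruction preserves the original document order.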

Document Length and AI Performance

Why compression matters for downstream processing

Technical Challenge

As documents grow longer, a language model must spread its attention over many more tokens. This dilutes the attention any single token receives, degrading retrieval accuracy and instruction following.

Many documents exceed practical context limits. Even when a document is technically processable, performance degrades with length.

Compression as Pre-processing

MIRC increases signal density. By reducing token count while retaining semantic pointers, it allows downstream AI models to allocate attention more effectively.

Designed for AI Systems: Compressed output serves as an intermediate format. Large documents are compressed into a dense representation, enabling inference by systems that would otherwise be constrained by context window limits.

Research Findings

Empirical results from MIRC implementation on benchmark documents

Document Type           Document                                Compression   Characters     Chunks
Supreme Court Opinion   SFFA v. Harvard                         84.2%         483K -> 76K    162
Federal Legislation     Consolidated Appropriations Act, 2018   83.7%         848K -> 138K   284
Federal Legislation     One Big Beautiful Bill Act (2025)       84.7%         330K -> 51K    111
Special Counsel Report  Mueller Report Volume II                86.5%         622K -> 84K    208
Supreme Court Opinion   Dobbs v. Jackson                        85.6%         429K -> 62K    144
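The reported percentages follow directly from the before/after character counts. A minimal Swift helper (hypothetical, not part of MIRC) shows the arithmetic:

```swift
// Compression percentage from before/after character counts.
func compressionPercent(before: Double, after: Double) -> Double {
    (1.0 - after / before) * 100.0
}

// With the rounded figures for SFFA v. Harvard (483K -> 76K characters),
// this gives approximately 84.3%, consistent with the reported 84.2%
// (the K counts above are rounded).
let sffa = compressionPercent(before: 483, after: 76)
```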

Implementation

Reference implementation for macOS

MIT License - Open Source

Research Preview (v0.1.0)

The reference implementation is written in Swift using Apple's Foundation Models framework. All processing runs on-device, with chunks compressed in parallel via actor-based concurrency.
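The actor-based concurrency mentioned above can be illustrated with a small sketch. This is not the MIRC source; the actor name and its placeholder "compression" are hypothetical, and a real implementation would invoke Apple's Foundation Models framework where noted.

```swift
import Foundation

// Illustrative sketch: an actor gives the compressor's mutable state its
// own isolation domain, so parallel tasks cannot race on it.
actor ChunkCompressor {
    private var processed = 0   // actor-isolated mutable state

    func compress(_ chunk: String) -> String {
        processed += 1
        // A real implementation would call the on-device model here
        // (e.g. via Apple's Foundation Models framework).
        return chunk.trimmingCharacters(in: .whitespacesAndNewlines)
    }

    var count: Int { processed }
}
```

Callers outside the actor must `await` each call, and the Swift runtime serializes access to `processed` even when many chunk tasks run concurrently.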

Requirements: Apple Silicon (M1+) - macOS 26.0+ - Apple Intelligence