SUM EQUITIES

InfiniteContext — Tiered Memory for AI

An extensible memory layer that pushes past model context limits: a hierarchical bucket store with exact-flat persistence and approximate (HNSW) retrieval at scale, automatic categorization, multi-level summarization, and pluggable storage tiers from local disk to cloud. Public, MIT-licensed TypeScript library + CLI.

In development last commit 5 days ago · 50 commits / 30d Verified Jun 15, 2026

InfiniteContext architecture — Context Manager and Memory Router route content across local/cloud storage into an HNSW vector store, with categorization, summarization, and integrity utilities

Status: public, MIT-licensed, early-stage (v0.1.0). The library is inspectable and exercised by a Jest test suite with CI on Node 20/22, but it’s pre-1.0 with no stable release. It’s distributed as source on GitHub — there is no published npm package under this name (that name on the registry belongs to an unrelated author). The “What’s Needed To Tighten” section is the honest gap list.

InfiniteContext is a TypeScript memory layer that lets AI applications work past the context window — storing, organizing, and retrieving large amounts of information across storage tiers instead of cramming everything into one prompt. It ships as a Node library and a CLI (infinite-context), with OpenAI wired in for embeddings and summarization.

Three Hierarchies

The system layers at three distinct levels:

  1. Storage tiers — a Memory Router fronts pluggable providers (local disk today, Google Drive via the cloud provider, with a common StorageProvider interface for adding more), so hot and cold memory live where they belong.
  2. Multi-level summarization — a SummarizationEngine compresses content into summaries at several levels of abstraction, so long histories can be served without replaying everything verbatim.
  3. Hierarchical bucket organization — content is filed into a domain/topic bucket tree and retrieved through a HierarchicalRetriever, with automatic categorization deciding where new content lands.

What’s Implemented Today

  • Vector retrieval — exact flat index artifacts that persist and support rebuild / merge / split / save / load, plus an approximate HNSW backend (hnswlib-node, with graceful flat fallback) that the hierarchical retriever switches to per-trace above a ~2,000-item threshold for sub-linear search at scale. A committed, key-free benchmark (docs/retrieval-benchmark.md) measures the HNSW backend against exact flat search: recall@10 = 0.995 at N=1,000 (≈1.3× p50 speedup), scaling to ~24× p50 speedup at 100k vectors — i.e. near-exact results at a fraction of the latency.
  • Hierarchical bucket store — domain→topic organization with a multi-strategy retriever, covered by tests.
  • Automatic categorization — a PromptCategorizer routes content using keyword, vector-similarity, and adaptive strategies, with a category cache.
  • Multi-level summarization — OpenAI-driven summaries at configurable abstraction levels (real chat.completions calls), with a deterministic extractive fallback when no LLM client is configured.
  • Negation-aware retrieval — Widdows orthogonal negation, so “X but not Y” queries subtract the negated subspace (with a design note connecting it to compositional/DisCoCat meaning).
  • Pluggable storage — local and Google Drive providers behind one interface.
  • Robustness layer — transaction management with rollback, data-integrity verification, backup/recovery, and data portability (JSON / JSONL / CSV import-export).

Technical Stack

  • Language: TypeScript (Node library + infinite-context CLI)
  • Vector index: exact flat (persisted artifacts) + approximate HNSW via hnswlib-node for retrieval at scale (IVF reserved)
  • Embeddings / summarization: OpenAI (text-embedding-3-small by default)
  • Storage: local filesystem, Google Drive; SQLite / MongoDB / Redis client deps present for backends
  • Tests / CI: Jest, GitHub Actions on Node 20 + 22
  • License: MIT
  • Docs: API, Architecture, Categorization, Compositional Meaning, Memory Monitoring, Extending, llamafile, macOS setup

What’s Needed For This Entry To Tighten

  • A tagged stable release (it’s at v0.1.0). An install path already exists if wanted: the bare npm name infinite-context is taken by another author, but the package’s actual scoped name @ototao/infinite-context is unpublished (free) — so shipping to npm is a choice, not a blocker.
  • A hosted demo or a recorded end-to-end run (ingest → store → retrieve) a visitor can verify without cloning.
  • A labeled retrieval-quality eval. The committed benchmark above measures HNSW’s fidelity to exact search (approximation quality + speedup), not end-to-end answer quality against ground-truth relevance — that’s the next benchmark to add.
  • Shipped

    Exact flat vector store with persisted, rebuildable index artifacts (save/load/merge/split), plus an approximate HNSW backend for retrieval at scale

    src/core/VectorStore.ts (persisted exact flat artifacts), src/utils/IndexManager.ts (flat + hnswlib-node HNSW, graceful flat fallback), src/core/HierarchicalRetriever.ts (per-trace HNSW ANN above a 2,000-item threshold); tests/core/VectorStore.test.ts

  • Shipped

    Hierarchical bucket memory with a multi-strategy retriever

    src/core/{Bucket,MemoryManager,HierarchicalRetriever}.ts; tests/core/{Bucket,HierarchicalRetriever}.test.ts

  • Shipped

    Automatic prompt/output categorization with keyword, vector-similarity, and adaptive strategies

    src/categorization/PromptCategorizer.ts + strategies/

  • Shipped

    Multi-level summarization via OpenAI embeddings/completions

    src/summarization/SummarizationEngine.ts

  • Shipped

    Negation-aware retrieval (Widdows orthogonal negation)

    src/core/NegationAwareRetrieval.ts; tests/core/NegationAwareRetrieval.test.ts

  • Shipped

    Pluggable storage tiers — local disk and Google Drive behind a common provider interface

    src/providers/{StorageProvider,LocalStorageProvider,GoogleDriveProvider}.ts

  • Shipped

    Continuous integration on Node 20 and 22 (build + test)

    .github/workflows CI; Jest suite under tests/

Repository README

InfiniteContext

An extensible memory architecture providing virtually unlimited context for AI systems.

Overview

InfiniteContext is a TypeScript library that provides a structured way to store, organize, and retrieve large amounts of contextual information across different storage tiers. It's designed to solve the context limitation problem for AI systems by creating a robust, hierarchical memory architecture that seamlessly scales from local to cloud storage.

Key Features

  • Hierarchical Bucket System: Organize information by domain and topic in a hierarchical structure
  • Automatic Categorization: Intelligently categorize prompts and outputs into appropriate buckets
  • Tiered Storage Architecture: Use different storage providers from local disk to cloud services
  • Vector-Based Retrieval: Efficient semantic search across all stored information
  • Multi-Level Summarization: Automatic generation of summaries at different levels of abstraction
  • Extensible Storage: Add custom storage providers to integrate with any system
  • OpenAI Integration: Ready-to-use integration with OpenAI for embeddings and summarization
  • Robust Error Handling: Comprehensive error handling with custom error types and recovery mechanisms
  • Data Integrity: Verification and repair of data integrity to prevent corruption
  • Backup & Recovery: Automated backup and recovery of stored data
  • Data Portability: Export and import data in various formats (JSON, JSONL, CSV)
  • Flat Vector Index Artifacts: Exact, persisted vector indices for rebuild, merge, split, save, and load workflows
  • Transaction Management: Atomic operations with rollback capability

Installation

npm install @ototao/infinite-context

Basic Usage

import { InfiniteContext } from '@ototao/infinite-context';
import { OpenAI } from 'openai';

// Create an OpenAI client for embeddings and summarization
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// Initialize the system
const context = new InfiniteContext({
  openai,
  embeddingModel: 'text-embedding-3-small'
});

await context.initialize();

// Store content in a bucket
const chunkId = await context.storeContent(
  'InfiniteContext provides virtually unlimited memory for AI systems through distributed storage.',
  {
    bucketName: 'documentation',
    bucketDomain: 'product',
    metadata: {
      source: 'readme',
      tags: ['documentation', 'overview']
    }
  }
);

// Retrieve relevant content
const results = await context.retrieveContent('How does InfiniteContext work?');

for (const { chunk, score } of results) {
  console.log(`Score: ${score.toFixed(3)}`);
  console.log(`Content: ${chunk.content}`);
}

// Generate summaries
const summaries = await context.summarize(longText, { levels: 3 });

Architecture

The system is designed with a modular architecture:

graph TD
    A[User Input] --> B[Context Manager]
    B --> C[Memory Router]
    C --> D[Local Storage]
    C --> E[Cloud Storage]
    C --> F[External Platforms]
    D --> G[Vector Store]
    E --> G
    F --> G
    G --> H[Retrieval Engine]
    H --> I[Response Generator]
    I --> J[User Output]
    K[Summarization Engine] --> G
    B --> K
    
    %% Categorization system
    R[Prompt Categorizer] --> B
    R --> S[Keyword Strategy]
    R --> T[Vector Strategy]
    R --> U[Adaptive Strategy]
    V[Category Cache] -.-> R
    
    %% Other utilities
    L[Error Handler] -.-> B
    M[Transaction Manager] -.-> C
    N[Integrity Verifier] -.-> G
    O[Backup Manager] -.-> C
    P[Data Portability] -.-> C
    Q[Index Manager] -.-> G

Core Components

  • MemoryManager: The main entry point that coordinates all components
  • Bucket: Organizes chunks into domains and hierarchies
  • VectorStore: Handles storage and retrieval of embeddings
  • StorageProvider: Interface for different storage systems
  • SummarizationEngine: Generates summaries at different levels
  • PromptCategorizer: Automatically categorizes prompts and outputs into appropriate buckets

Utility Components

  • ErrorHandler: Comprehensive error handling with custom error types
  • TransactionManager: Atomic operations with rollback capability
  • IntegrityVerifier: Verification and repair of data integrity
  • BackupManager: Automated backup and recovery of stored data
  • DataPortability: Export and import data in various formats
  • IndexManager: Exact flat-index artifact management; approximate HNSW/IVF backends are reserved for a future release and fail explicitly if requested

Vector Index Support

InfiniteContext's first functional release does not support approximate vector indexing. The built-in VectorStore performs exact in-memory search and persists two files when saved:

  • <path>.json for chunk payloads
  • <path>.index.json for a real flat index artifact containing indexed ids, positions, embeddings, and chunks

IndexManager supports exact flat index rebuild, merge, split, save, and load artifact workflows. IndexType.HNSW and IndexType.IVF remain reserved enum values for future backends; operations using those types return false or throw through validation paths instead of reporting fake success.

For a quantified comparison of exact flat search against an approximate HNSW index (recall@k and p50/p95 latency at 1k/10k/100k synthetic vectors), see docs/retrieval-benchmark.md. Run it with npm run bench:retrieval.

Storage Providers

InfiniteContext comes with built-in providers:

  • LocalStorageProvider: Uses the local filesystem
  • GoogleDriveProvider: Stores data in Google Drive

You can add custom providers by implementing the StorageProvider interface.

Advanced Features

Google Drive Integration

const context = new InfiniteContext({
  openai,
  embeddingModel: 'text-embedding-3-small'
});

await context.initialize({
  addGoogleDrive: true,
  googleDriveCredentials: {
    clientId: process.env.GOOGLE_CLIENT_ID!,
    clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
    redirectUri: 'http://localhost:3000/oauth2callback',
    refreshToken: process.env.GOOGLE_REFRESH_TOKEN!
  }
});

Custom Embedding Function

const context = new InfiniteContext({
  basePath: './custom-storage',
  embeddingFunction: async (text) => {
    // Your custom embedding implementation
    // Must return a Vector (number[])
  }
});

Backup and Recovery

// Create a backup of all data
const backup = await context.createBackup({
  includeVectorStores: true,
  maxBackups: 5 // Keep only the 5 most recent backups
});

console.log(`Backup created: ${backup.id}`);

// List available backups
const backups = await context.listBackups();
console.log(`Available backups: ${backups.length}`);

// Recover from a backup
const recovered = await context.recoverFromBackup({
  backupId: backups[0].id,
  overwriteExisting: false
});

if (recovered) {
  console.log('Recovery successful');
}

Data Portability

// Export chunks to a file
const exportResult = await context.exportChunks(chunks, {
  format: 'json',
  outputPath: './export/data.json',
  compress: true,
  includeEmbeddings: true
});

console.log(`Exported ${exportResult.count} chunks to ${exportResult.path}`);

// Import chunks from a file
const importResult = await context.importChunks({
  inputPath: './export/data.json.gz',
  bucketName: 'imported',
  bucketDomain: 'external',
  decompress: true
});

console.log(`Imported ${importResult.succeeded} chunks`);

Data Integrity

// Verify the integrity of a chunk
const verificationResult = await context.verifyChunkIntegrity(chunk, storedHash);

if (verificationResult.isValid) {
  console.log('Chunk is valid');
} else {
  console.log(`Chunk integrity issues: ${verificationResult.errors.length}`);
  
  // Try to repair the chunk
  const repairedChunk = await context.repairChunk(chunk, verificationResult);
  
  if (repairedChunk) {
    console.log('Chunk repaired successfully');
  } else {
    console.log('Chunk could not be repaired');
  }
}

Automatic Categorization

// Initialize with categorizer enabled
const context = new InfiniteContext({
  openai,
  categorizerOptions: {
    cacheSize: 1000,
    enableLearning: true
  }
});

await context.initialize({
  initializeCategorizer: true
});

// Store a prompt and output with automatic categorization
const prompt = 'Explain how JavaScript promises work.';
const output = 'JavaScript promises are objects that represent...';

const chunkId = await context.storePromptAndOutput(prompt, output, {
  metadata: {
    source: 'user-question',
    tags: ['javascript', 'async']
  }
});

// Override automatic categorization with manual bucket assignment
const overrideId = await context.storePromptAndOutput(
  'What are the best practices for data visualization?',
  'When creating data visualizations, follow these best practices...',
  {
    // Manual override provides feedback to improve future categorizations
    overrideBucket: {
      name: 'visualization',
      domain: 'data'
    }
  }
);

// Update the categorizer when adding new buckets
await context.updateCategorizer();

For more details, see the Categorization documentation.

Flat Vector Index Artifacts

// The first functional release always recommends the supported exact flat index.
const params = await context.getOptimalIndexParams(
  chunks.length,
  chunks[0].embedding.length
);

console.log(`Supported index type: ${params.type}`); // "flat"

// Estimate memory usage for the flat index.
const memoryUsage = await context.estimateIndexMemoryUsage(params, chunks.length);
console.log(`Estimated memory usage: ${memoryUsage / (1024 * 1024)} MB`);

// Rebuild writes an actual JSON artifact. HNSW/IVF requests are rejected until
// approximate backends are implemented end-to-end.
const rebuilt = await context.rebuildIndex(chunks, params, './vector-indices/chunks.flat-index.json');
if (!rebuilt) {
  throw new Error('Index rebuild failed');
}

Setup Guide for macOS

This comprehensive guide will walk you through setting up InfiniteContext on macOS. For more detailed instructions, see the macOS Setup Guide.

Prerequisites

  1. macOS Requirements

    • macOS 10.15 (Catalina) or newer
    • At least 4GB of RAM (8GB+ recommended)
    • 1GB of free disk space
  2. Required Software

    • Node.js (v16.x or newer)
    • npm (v7.x or newer)
    • Git
  3. API Keys

    • OpenAI API key (for embeddings and summarization)
    • Google API credentials (optional, for Google Drive integration)

Installation Steps

Step 1: Install Node.js and npm

If you don't have Node.js installed:

# Using Homebrew (recommended)
brew install node

# Verify installation
node --version  # Should be v16.x or newer
npm --version   # Should be v7.x or newer

If you prefer to manage multiple Node.js versions:

# Install nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash

# Install and use Node.js v18
nvm install 18
nvm use 18
Step 2: Clone the Repository
# Clone the repository
git clone https://github.com/OtotaO/InfiniteContext.git
cd InfiniteContext

# Install dependencies
npm install
Step 3: Configure Environment Variables

Create a .env file in the project root:

touch .env

Add your API keys and configuration:

# OpenAI API Key (required for embeddings and summarization)
OPENAI_API_KEY=your_openai_api_key_here

# Google Drive Integration (optional)
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GOOGLE_REDIRECT_URI=http://localhost:3000/oauth2callback
GOOGLE_REFRESH_TOKEN=your_google_refresh_token

# Storage Configuration (optional)
STORAGE_BASE_PATH=~/infinite-context-data
Step 4: Build the Project
# Build the TypeScript code
npm run build
Step 5: Run Basic Tests
# Run tests to ensure everything is working
npm test

Advanced Configuration

Google Drive Integration

To set up Google Drive integration:

  1. Create a Google Cloud project at https://console.cloud.google.com/
  2. Enable the Google Drive API
  3. Create OAuth 2.0 credentials
  4. Run the authorization flow to get a refresh token:
# Create a script to get the refresh token
cat > get-google-token.js << 'EOF'
import { OAuth2Client } from 'google-auth-library';
import http from 'http';
import url from 'url';
import open from 'open';
import dotenv from 'dotenv';

dotenv.config();

const CLIENT_ID = process.env.GOOGLE_CLIENT_ID;
const CLIENT_SECRET = process.env.GOOGLE_CLIENT_SECRET;
const REDIRECT_URI = 'http://localhost:3000/oauth2callback';
const SCOPES = ['https://www.googleapis.com/auth/drive.file'];

const oauth2Client = new OAuth2Client(CLIENT_ID, CLIENT_SECRET, REDIRECT_URI);

async function getRefreshToken() {
  const authUrl = oauth2Client.generateAuthUrl({
    access_type: 'offline',
    scope: SCOPES,
    prompt: 'consent'
  });

  console.log('Authorize this app by visiting this URL:', authUrl);
  await open(authUrl);

  return new Promise((resolve, reject) => {
    const server = http.createServer(async (req, res) => {
      try {
        const queryParams = url.parse(req.url, true).query;
        const code = queryParams.code;
        
        if (code) {
          res.writeHead(200, {'Content-Type': 'text/html'});
          res.end('Authentication successful! You can close this window.');
          
          const {tokens} = await oauth2Client.getToken(code);
          console.log('Refresh token:', tokens.refresh_token);
          
          server.close();
          resolve(tokens.refresh_token);
        }
      } catch (e) {
        reject(e);
      }
    }).listen(3000);
  });
}

getRefreshToken().catch(console.error);
EOF

# Install required packages
npm install google-auth-library open

# Run the script
node get-google-token.js
  1. Add the refresh token to your .env file
Memory Monitoring

Enable memory monitoring to get alerts when storage thresholds are reached:

await context.initialize({
  enableMemoryMonitoring: true,
  memoryMonitoringConfig: {
    bucketSizeThresholdMB: 100,
    providerCapacityThresholdPercent: 80,
    monitoringIntervalMs: 60000 // Check every minute
  }
});

// Add a custom alert handler
context.addMemoryAlertHandler((alert) => {
  console.log(`ALERT: ${alert.message}`);
  // Send notification, email, etc.
});

Troubleshooting

Common Issues
  1. "Error: No embedding function available"

    • Make sure you've provided a valid OpenAI API key
    • Check that the embedding model is available in your OpenAI account
  2. "Error: EACCES: permission denied"

    • Check file permissions in your storage directory
    • Run with sudo (not recommended) or adjust permissions
  3. "Error: Cannot find module"

    • Ensure you've run npm install and npm run build
    • Check import paths in your code
Debugging

Enable debug logging by setting the DEBUG environment variable:

DEBUG=infinite-context:* node your-script.js

Maintenance

Backups

Create regular backups of your data:

// Create a backup
const backup = await context.createBackup({
  backupPath: '/Users/yourusername/backups',
  includeVectorStores: true
});

console.log(`Backup created: ${backup.id}`);
Updates

Keep the library updated:

# Pull latest changes
git pull

# Update dependencies
npm update

# Rebuild
npm run build

License

MIT

Related work