
GAINS — Gesture-Assisted Intelligent Note Scribe

Desktop note-taking where hand gestures control navigation while voice captures content. Tauri shell with Python services, MediaPipe hand tracking, ZMQ bridge.

In development · Verified Apr 27, 2026

GAINS — Gesture-assisted note-taking interface

A multimodal note-taking system that combines hand gestures, voice, and LLM-based summarization to create documents. Point to scroll, pinch to select, and swipe to navigate, all while speaking your notes naturally.

What Makes It Different

Most note-taking tools optimize for keyboard input. GAINS is designed for situations where your hands are busy or a keyboard isn’t practical — standing meetings, lab work, cooking, workshops. Gestures handle navigation and structure while voice captures content.

Gesture Recognition

Hand tracking uses MediaPipe Hands, with the recognizer running as a Python service. A standard webcam is all that’s required.
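As an illustration of how a gesture such as "pinch to select" might be derived from hand landmarks, here is a minimal sketch. It assumes the recognizer receives the 21 normalized landmarks that MediaPipe Hands reports per hand (index 0 is the wrist, 4 the thumb tip, 8 the index fingertip, 9 the middle-finger MCP joint). The function name `is_pinch` and the `ratio_threshold` value are hypothetical and not taken from the GAINS codebase.

```python
import math
from typing import Sequence, Tuple

# A landmark as normalized (x, y) image coordinates in [0, 1].
Landmark = Tuple[float, float]

# Landmark indices as defined by the MediaPipe Hands model.
WRIST = 0
THUMB_TIP = 4
INDEX_TIP = 8
MIDDLE_MCP = 9


def _dist(a: Landmark, b: Landmark) -> float:
    """Euclidean distance between two normalized landmarks."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def is_pinch(landmarks: Sequence[Landmark], ratio_threshold: float = 0.25) -> bool:
    """Return True when thumb tip and index tip are close together.

    The tip distance is divided by a rough hand size (wrist to
    middle-finger MCP) so the check does not depend on how far the
    hand is from the camera. The threshold is an assumed value that
    would need tuning on real data.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    hand_size = _dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if hand_size == 0:
        return False
    tip_gap = _dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    return tip_gap / hand_size < ratio_threshold
```

Normalizing by hand size is one possible design choice; a fixed pixel threshold would fire differently depending on the distance between the hand and the webcam.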

Voice-to-Structure

Speech isn’t just transcribed — it’s parsed into structured sections, action items, and key points. The gesture layer controls where and how content is placed.
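The structured output described above could be represented roughly as follows. This is a sketch under assumptions: the JSON shape (`sections`, `heading`, `key_points`, `action_items`) and the names `NoteSection` and `parse_structured_note` are hypothetical stand-ins for whatever the summarization service actually returns.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class NoteSection:
    """One section of a structured note (hypothetical schema)."""
    heading: str
    key_points: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)


def parse_structured_note(raw: str) -> List[NoteSection]:
    """Parse a JSON string, as an LLM might return it, into sections.

    Missing fields fall back to defaults so a partially filled
    response still produces usable sections.
    """
    data = json.loads(raw)
    sections = []
    for item in data.get("sections", []):
        sections.append(
            NoteSection(
                heading=str(item.get("heading", "Untitled")),
                key_points=[str(p) for p in item.get("key_points", [])],
                action_items=[str(a) for a in item.get("action_items", [])],
            )
        )
    return sections
```

With a representation like this, the gesture layer would only need to decide which `NoteSection` receives the next piece of content and where it is placed in the document.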

Service Architecture

A ZMQ bridge decouples the desktop shell from the Python inference services, so gesture recognition, speech processing, and summarization each run as independent workers. Tests covering the ZMQ bridge and sprint-level service integration live next to the services.
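One way messages could cross such a bridge is as two-frame multipart messages: a topic frame followed by a JSON payload frame. The sketch below covers only the encoding and decoding, using the standard library; the topic names, the envelope fields, and the functions `encode_event` and `decode_event` are assumptions rather than the project's actual wire format. With pyzmq, the resulting frames could be passed to `socket.send_multipart(frames)` and read back with `socket.recv_multipart()`.

```python
import json
import time
from typing import Any, Dict, List, Tuple


def encode_event(topic: str, payload: Dict[str, Any]) -> List[bytes]:
    """Build a two-frame message: [topic, JSON envelope].

    Keeping the topic in its own frame lets a ZeroMQ SUB socket
    filter by topic prefix without parsing the JSON body.
    """
    envelope = {"ts": time.time(), "payload": payload}
    return [topic.encode("utf-8"), json.dumps(envelope).encode("utf-8")]


def decode_event(frames: List[bytes]) -> Tuple[str, Dict[str, Any]]:
    """Split a two-frame message back into (topic, payload)."""
    if len(frames) != 2:
        raise ValueError("expected exactly two frames: topic and body")
    topic = frames[0].decode("utf-8")
    envelope = json.loads(frames[1].decode("utf-8"))
    return topic, envelope["payload"]
```

For example, a gesture worker might publish `encode_event("gesture.pinch", {"hand": "right"})`, and the shell side would call `decode_event` on whatever it receives. Because each worker only depends on this envelope, gesture, speech, and summarization services can be started, stopped, and tested independently.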

Technical Stack

  • Shell: Tauri desktop app (cross-platform)
  • Services: Python microservices for gesture, speech, and summarization
  • Transport: ZeroMQ bridge between shell and services
  • Gesture: MediaPipe Hands
  • Voice: ASR + structured summarization via LLM
  • Layout: GAINS/ (app core) · services/ (Python workers) · tauri-app/ (desktop shell)

Repository README

GAINS — Gesture-Assisted Intelligent Note Scribe

GAINS is a note-taking application that uses gesture recognition as an input method. Machine-learning models interpret the user's hand gestures, so notes can be navigated and organized without a keyboard, and the application runs on multiple platforms.