BaseTone

Voice-Powered AI Metaverse on Base

AI, Base Batch India

About this project

The problem it solves

The Problem BaseTone Solves

  • Reduces Complexity of Web3 Interactions:
    • Interacting with the onchain world can be intimidating due to complex UIs, wallet addresses, and gas fee management.
    • BaseTone introduces a voice-first interaction model, allowing users to execute onchain actions using natural language commands (e.g., "Send 0.1 ETH to deepak.base.eth," "What's the price of Bitcoin?").
  • Enhances User Experience:
    • Instead of navigating multiple dApps, users interact within a single, engaging 2D pixel-art metaverse (the "Bharat Bazaar").
    • This makes discovering and utilizing Web3 functionalities more intuitive, gamified, and enjoyable.
  • Simplifies Onchain Identity & Transactions:
    • Integration with Coinbase Smart Wallet concepts and support for ENS-like names (e.g., yourname.base.eth) make transactions user-friendly by replacing long hexadecimal addresses with readable names.
  • Provides Contextual AI Assistance:
    • The AI agent is aware of the user's location within the metaverse (e.g., which "stall" they are near), enabling more relevant and efficient assistance for specific onchain tasks on the Base network.
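
The stall-aware context described above can be sketched as a proximity lookup. This is a minimal illustration, not the project's actual code: the Stall shape, the 64-unit radius, the stall ids, and the function names nearestStall and buildAgentContext are all assumptions. Only the idea of stalls and the agentFeatureHint field come from the write-up.

```typescript
// Hypothetical shapes: the write-up only names "stalls" and an agentFeatureHint.
interface Point {
  x: number;
  y: number;
}

interface Stall {
  id: string;
  position: Point;
  // Short hint handed to the agent describing what this stall is for.
  agentFeatureHint: string;
}

// Return the closest stall within `radius` of the player, or null if none is in range.
function nearestStall(player: Point, stalls: Stall[], radius: number): Stall | null {
  let best: Stall | null = null;
  let bestDist = radius;
  for (const stall of stalls) {
    const dist = Math.hypot(stall.position.x - player.x, stall.position.y - player.y);
    if (dist <= bestDist) {
      best = stall;
      bestDist = dist;
    }
  }
  return best;
}

// Build a context line that could be appended to the agent's system prompt.
// The default radius of 64 is an assumed value for illustration.
function buildAgentContext(player: Point, stalls: Stall[], radius: number = 64): string {
  const stall = nearestStall(player, stalls, radius);
  return stall !== null
    ? `User is near stall "${stall.id}". Hint: ${stall.agentFeatureHint}`
    : "User is not near any stall.";
}

// Example data (invented for this sketch).
const stalls: Stall[] = [
  { id: "swap", position: { x: 100, y: 100 }, agentFeatureHint: "token swaps on Base" },
  { id: "price", position: { x: 400, y: 120 }, agentFeatureHint: "token price lookups" },
];

console.log(buildAgentContext({ x: 110, y: 90 }, stalls));
// prints: User is near stall "swap". Hint: token swaps on Base
```

Keeping the lookup as a pure function of the player position means the same hint can be recomputed on every movement update without touching the agent backend until the user actually speaks.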

In essence, BaseTone aims to make "being onchain" feel as natural and effortless as having a conversation, transforming complex blockchain operations into simple voice commands within a fun, interactive virtual space.

Challenges we ran into

Challenges Encountered During Development

  • Real-time Voice Interaction Pipeline:
    • Obstacle: Achieving a low-latency, seamless flow for Speech-to-Text (STT), AI agent processing (OpenAI + AgentKit), and Text-to-Speech (TTS) without breaking immersion.
    • Solution: Optimized API calls, streamed TTS audio through a backend proxy, tracked explicit frontend states (listening, processing, speaking), and orchestrated the whole flow with the useVoiceInteraction custom hook.
  • AgentKit Tool Reliability & Context Management:
    • Obstacle: Ensuring the AI agent reliably understood user intent, selected the correct AgentKit tool, and supplied accurate parameters without hallucinating values.
    • Solution: Iteratively refined system prompts, provided clear tool descriptions and parameter schemas (with Zod validation), implemented a "clarify_or_get_more_info" fallback tool, and used stall-specific context (agentFeatureHint) to guide the agent.
  • Multi-Component Architecture & State Synchronization:
    • Obstacle: Managing and synchronizing state (player positions, emotes, room data) consistently across the Next.js frontend, AI agent backend, and Socket.IO multiplayer backend.
    • Solution: Employed a centralized Socket.IO server for multiplayer state, established clear API contracts between frontend and AI backend, and used React Context (SocketContext) for client-side socket management.
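
The explicit voice states mentioned above (listening, processing, speaking) lend themselves to a small state machine. The following is a hedged sketch rather than the contents of useVoiceInteraction: the "idle" state, the event names, and the nextState function are assumptions introduced for illustration.

```typescript
// "listening", "processing", and "speaking" come from the write-up; "idle" is assumed.
type VoiceState = "idle" | "listening" | "processing" | "speaking";

// Hypothetical event names; the real hook may use different ones.
type VoiceEvent =
  | "START_LISTENING"
  | "TRANSCRIPT_READY"
  | "REPLY_READY"
  | "PLAYBACK_ENDED"
  | "ERROR";

// Allowed transitions per state. Any event not listed for a state is ignored.
const transitions: Record<VoiceState, Partial<Record<VoiceEvent, VoiceState>>> = {
  idle: { START_LISTENING: "listening" },
  listening: { TRANSCRIPT_READY: "processing", ERROR: "idle" },
  processing: { REPLY_READY: "speaking", ERROR: "idle" },
  speaking: { PLAYBACK_ENDED: "idle", ERROR: "idle" },
};

// Pure transition function: unknown or out-of-order events leave the state unchanged,
// so a duplicate transcript or a late audio event cannot push the UI into a bad state.
function nextState(state: VoiceState, event: VoiceEvent): VoiceState {
  return transitions[state][event] ?? state;
}

// Walk one full interaction: speak, transcribe, get a reply, play it back.
const events: VoiceEvent[] = [
  "START_LISTENING",
  "TRANSCRIPT_READY",
  "REPLY_READY",
  "PLAYBACK_ENDED",
];
const finalState = events.reduce<VoiceState>((s, e) => nextState(s, e), "idle");

console.log(finalState);
// prints: idle
```

In a React client, a function like this could be passed to useReducer inside the hook, which keeps the transition rules testable without mounting any components.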

About the founders

Building on Base from India

Technologies and tags

Node.js, Tailwind CSS, React.js, TypeScript, Express.js, OpenAI, Socket.IO, Speech API, Text-to-Speech