Quick Start

Get Tokenlay running in under 5 minutes. Track costs, enforce limits, and route models without changing your existing AI code.

Create your account

Generate your Tokenlay key

Once logged in, generate your API key from the dashboard. You’ll need this to authenticate with Tokenlay’s proxy.

Install the SDK

Pick your preferred language and install the Tokenlay SDK alongside the OpenAI SDK.

npm install @tokenlay/sdk openai

Update your code

Replace your existing AI client with Tokenlay’s drop-in replacement. Your API calls remain exactly the same.

import { TokenlayOpenAI } from "@tokenlay/sdk";

const client = new TokenlayOpenAI({
  // Key for your model provider (here, OpenAI)
  providerApiKey: process.env.OPENAI_API_KEY,
  // Key generated in the Tokenlay dashboard
  tokenlayKey: process.env.TOKENLAY_KEY,
});

// Everything else stays the same
const response = await client.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Hello!" }],
});
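
Both keys are read from environment variables, so a missing variable would otherwise only surface when the first request fails. As a minimal sketch, a small guard can fail fast at startup instead. The `requireEnv` helper is hypothetical and not part of the Tokenlay SDK; only the variable names `OPENAI_API_KEY` and `TOKENLAY_KEY` come from the snippet above.

```typescript
// Hypothetical helper (not part of the Tokenlay SDK): returns the named
// variable or throws a descriptive error if it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in Node.js, before constructing the client:
//   const providerApiKey = requireEnv(process.env, "OPENAI_API_KEY");
//   const tokenlayKey = requireEnv(process.env, "TOKENLAY_KEY");
```

Passing the environment in as an argument keeps the helper easy to test without touching the real process environment.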

Send your first request

Run your updated code. Once Tokenlay receives your requests, you’ll see them appear in your Dashboard.

Your Analytics tab will show:

Real-time cost tracking
Usage by user, model, and feature
Request volume and patterns

What’s next?

Now that you’re tracking requests, take these steps to unlock Tokenlay’s full potential:

Add metadata

Metadata attaches user-specific data, such as subscription tier or spend, to each request so that rules can act on it. Add metadata in one of three ways:

At client creation — Set default metadata for all requests
Per request — Add metadata to individual API calls
Automatically — Use integrations like Next.js middleware for seamless user tracking

Example: Track user tier and spending to create rules like “if a user is on the pro tier and has spent more than $50 in an hour, time them out for 10 minutes.”
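
To make the example rule concrete, here is a minimal sketch of the logic it describes, written as plain TypeScript. Tokenlay evaluates rules for you, so this is not the SDK's API: every type and function name here (`RequestMetadata`, `SpendEvent`, `evaluateRule`) is hypothetical, and reading "per hour" as a trailing 60-minute window is an assumption.

```typescript
// Hypothetical shapes for illustration; not part of the Tokenlay SDK.
interface RequestMetadata {
  userId: string;
  tier: "free" | "pro";
}

interface SpendEvent {
  userId: string;
  costUsd: number;
  timestampMs: number;
}

type Decision =
  | { action: "allow" }
  | { action: "timeout"; untilMs: number };

const HOUR_MS = 60 * 60 * 1000;
const TIMEOUT_MS = 10 * 60 * 1000; // 10-minute timeout from the example rule
const HOURLY_LIMIT_USD = 50; // $50 threshold from the example rule

// Sum a user's spend over the trailing hour (assumed window definition).
function spendInLastHour(
  events: SpendEvent[],
  userId: string,
  nowMs: number,
): number {
  return events
    .filter(
      (e) =>
        e.userId === userId &&
        e.timestampMs > nowMs - HOUR_MS &&
        e.timestampMs <= nowMs,
    )
    .reduce((sum, e) => sum + e.costUsd, 0);
}

// Apply the rule: pro tier AND more than $50 in the last hour => timeout.
function evaluateRule(
  meta: RequestMetadata,
  events: SpendEvent[],
  nowMs: number,
): Decision {
  const overLimit =
    spendInLastHour(events, meta.userId, nowMs) > HOURLY_LIMIT_USD;
  if (meta.tier === "pro" && overLimit) {
    return { action: "timeout", untilMs: nowMs + TIMEOUT_MS };
  }
  return { action: "allow" };
}
```

The sketch shows why both pieces of metadata matter: without the tier the rule cannot target pro users, and without a user ID there is nothing to aggregate spend against.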

Learn more about Passing Metadata.

Configure smart rules

Use the dashboard to create rules that match your business needs. Smart rules combine metadata, usage patterns, and costs to control AI access automatically.

Examples:

Route expensive requests from free users to cheaper models
Apply rate limits based on subscription tier
Block requests when monthly budget is exceeded

Set up your Smart Rules in the dashboard.
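
Smart rules are configured in the dashboard rather than in code, but it can help to see the decision logic spelled out. The following is a rough sketch of two of the examples above (model routing and budget blocking). All names, the budget, and the cost threshold are assumptions chosen for illustration, not Tokenlay defaults.

```typescript
// Hypothetical types and values for illustration only.
type Tier = "free" | "pro";

interface RoutingInput {
  tier: Tier;
  requestedModel: string;
  estimatedCostUsd: number;
  monthlySpendUsd: number;
}

type RoutingDecision =
  | { action: "block"; reason: string }
  | { action: "route"; model: string };

const MONTHLY_BUDGET_USD = 100; // assumed budget
const EXPENSIVE_REQUEST_USD = 0.05; // assumed "expensive" threshold
const CHEAPER_MODEL = "gpt-3.5-turbo"; // assumed fallback model

function decideRoute(input: RoutingInput): RoutingDecision {
  // Budget check comes first: nothing is served once the budget is exceeded.
  if (input.monthlySpendUsd > MONTHLY_BUDGET_USD) {
    return { action: "block", reason: "monthly budget exceeded" };
  }
  // Expensive requests from free users go to the cheaper model.
  if (input.tier === "free" && input.estimatedCostUsd > EXPENSIVE_REQUEST_USD) {
    return { action: "route", model: CHEAPER_MODEL };
  }
  // Otherwise the requested model is used unchanged.
  return { action: "route", model: input.requestedModel };
}
```

Ordering matters in a sketch like this: checking the budget before routing means an over-budget request is blocked even if a cheaper model would otherwise have been chosen.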

Handle response formats

Configure how your application responds when smart rules trigger. Tokenlay can return different response types:

Timeouts — Temporarily block users who exceed limits
Warnings — Alert users approaching their limits while allowing requests
Blocks — Hard stops for budget or safety violations
Redirects — Route to alternative models or cached responses
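
On the application side, the four response types above map naturally onto a discriminated union. This is a minimal sketch under stated assumptions: the `RuleResponse` shape and its field names (`retryAfterSeconds`, `message`, `reason`, `model`) are hypothetical, and the actual payloads are described in the Response Formats documentation.

```typescript
// Hypothetical response shapes; the real Tokenlay payloads may differ.
type RuleResponse =
  | { type: "timeout"; retryAfterSeconds: number }
  | { type: "warning"; message: string }
  | { type: "block"; reason: string }
  | { type: "redirect"; model: string };

interface HandlingPlan {
  proceed: boolean; // whether the request still produces a completion
  userMessage?: string; // optional text to surface in the UI
}

function planFor(response: RuleResponse): HandlingPlan {
  switch (response.type) {
    case "timeout": {
      const minutes = Math.ceil(response.retryAfterSeconds / 60);
      return {
        proceed: false,
        userMessage: `Limit reached. Try again in ${minutes} minute(s).`,
      };
    }
    case "warning":
      // Warnings allow the request; only the message is surfaced.
      return { proceed: true, userMessage: response.message };
    case "block":
      return { proceed: false, userMessage: `Request blocked: ${response.reason}` };
    case "redirect":
      // The request was served by another model; nothing to show the user.
      return { proceed: true };
    default: {
      // Compile-time exhaustiveness check: fails if a new type is added.
      const unreachable: never = response;
      return unreachable;
    }
  }
}
```

Handling every case in one `switch` keeps the user-facing behavior for each rule outcome in a single place, and the `never` check flags any response type that is added later but not handled.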

Learn about Response Formats and error management.
