From 1dcd3809d30195492e5416fcf6a25bab2313d5f2 Mon Sep 17 00:00:00 2001
From: Andy Feller
Date: Thu, 18 Jun 2026 11:42:53 -0400
Subject: [PATCH] Add usage and billing metrics docs page

Document how SDK integrators read token counts, context-window utilization, AI credit cost, and account quota via session events and RPC methods (assistant.usage, session.usage_info, session.metadata.contextInfo, session.usage.getMetrics, models.list, account.getQuota). Examples are provided for TypeScript, Python, Go, .NET, Java, and Rust, and the compiled snippets pass the docs-validation harness. Adds a link to the features index.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/features/README.md            |    1 +
 docs/features/usage-and-billing.md | 1347 ++++++++++++++++++++++++++++
 2 files changed, 1348 insertions(+)
 create mode 100644 docs/features/usage-and-billing.md

diff --git a/docs/features/README.md b/docs/features/README.md
index fe2a98c2e..b0fbc5d53 100644
--- a/docs/features/README.md
+++ b/docs/features/README.md
@@ -17,6 +17,7 @@ These guides cover the capabilities you can add to your Copilot SDK application.
 | [Plugin Directories](./plugin-directories.md) | Bundle skills, hooks, MCP servers, and agents as a single loadable plugin |
 | [Image Input](./image-input.md) | Send images to sessions as attachments |
 | [Streaming Events](./streaming-events.md) | Subscribe to real-time session events (40+ event types) |
+| [Usage and Billing](./usage-and-billing.md) | Read token counts, context-window utilization, AI credit cost, and account quota |
 | [Steering & Queueing](./steering-and-queueing.md) | Control message delivery—immediate steering vs. sequential queueing |
 | [Session Persistence](./session-persistence.md) | Resume sessions across restarts, manage session storage |
 | [Remote Sessions](./remote-sessions.md) | Share locally hosted sessions to GitHub web and mobile via Mission Control |
diff --git a/docs/features/usage-and-billing.md b/docs/features/usage-and-billing.md
new file mode 100644
index 000000000..d97f5859c
--- /dev/null
+++ b/docs/features/usage-and-billing.md
@@ -0,0 +1,1347 @@
+# Usage and billing metrics
+
+This guide shows how to read token counts, context-window utilization, AI credit cost, and account quota from a Copilot SDK application. Examples are shown for TypeScript, Python, Go, .NET, Java, and Rust.
+
+> [!TIP]
+> Each example is functionally equivalent across languages. The TypeScript snippet is expanded by default; select your language from the collapsible blocks to see the same logic in that SDK.
+
+## Overview
+
+The SDK surfaces usage data through two complementary mechanisms:
+
+* **Session events**: ephemeral events the runtime emits as a turn runs. Subscribe to these for real-time, per-API-call data.
+* **RPC methods**: request/response calls you make on demand. Use these to snapshot accumulated totals or look up account-level quota.
+
+The table below maps each signal to the API that exposes it.
+
+| Signal | API | Scope | Type |
+|---|---|---|---|
+| Per-call token counts | `assistant.usage` event | Session | Event |
+| Context-window utilization | `session.usage_info` event | Session | Event |
+| Context-window breakdown (on demand) | `session.metadata.contextInfo` | Session | RPC |
+| Accumulated AI credit and token totals | `session.usage.getMetrics` | Session | RPC |
+| Per-model AI credit pricing | `models.list` | Server | RPC |
+| Account quota and premium interactions | `account.getQuota` | Server | RPC |
+
+> [!NOTE]
+> `session.usage.getMetrics`, `session.metadata.contextInfo`, and `session.metadata.recomputeContextTokens` are marked experimental in the generated RPC surface. In .NET they raise the `GHCP001` experimental diagnostic, which you suppress with `#pragma warning disable GHCP001` or a project-level `<NoWarn>GHCP001</NoWarn>`. Pin the versions of both the SDK and the Copilot CLI runtime if your application depends on these methods.
+
+The field tables below list only the fields used in the examples on this page. The complete, always-current field reference is the generated SDK types plus [Streaming events](./streaming-events.md), which is regenerated from the CLI schema on every dependency bump. Treat those as the source of truth and this page as a task-oriented guide.
+
+## Per-call token counts
+
+The `assistant.usage` event is emitted once for every model API call in a turn (including calls made by sub-agents). It carries the token counts and the billing multiplier for that single call.
+
+The example below uses these fields. See [Streaming events](./streaming-events.md#assistantusage) for the full list, including cache, reasoning, latency, and tracing fields.
+
+| Field | Type | Description |
+|---|---|---|
+| `model` | `string` | Model identifier for this call |
+| `inputTokens` | `number` | Input tokens consumed |
+| `outputTokens` | `number` | Output tokens produced |
+| `cost` | `number` | Premium request multiplier applied to this call |
+
+> [!TIP]
+> `assistant.usage` is ephemeral, so it is delivered live but not replayed when you resume a session. To read accumulated totals after the fact, call `session.usage.getMetrics` (see [Accumulated AI credit and token totals](#accumulated-ai-credit-and-token-totals)).
+
+
+Node.js / TypeScript + + +```typescript +import { CopilotClient } from "@github/copilot-sdk"; + +const client = new CopilotClient(); +const session = await client.createSession({ streaming: true }); + +session.on("assistant.usage", (event) => { + const { model, inputTokens, outputTokens, cost } = event.data; + console.log( + `${model}: in=${inputTokens ?? 0} out=${outputTokens ?? 0} cost=${cost ?? 0}`, + ); +}); +``` + + +```typescript +session.on("assistant.usage", (event) => { + const { model, inputTokens, outputTokens, cost } = event.data; + console.log( + `${model}: in=${inputTokens ?? 0} out=${outputTokens ?? 0} cost=${cost ?? 0}`, + ); +}); +``` + +
+ +
+Python + + +```python +from copilot import CopilotClient +from copilot.session_events import SessionEventType + +client = CopilotClient() +session = await client.create_session(streaming=True) + +def on_usage(event): + if event.type == SessionEventType.ASSISTANT_USAGE: + data = event.data + print(f"{data.model}: in={data.input_tokens or 0} out={data.output_tokens or 0} cost={data.cost or 0}") + +session.on(on_usage) +``` + + +```python +def on_usage(event): + if event.type == SessionEventType.ASSISTANT_USAGE: + data = event.data + print(f"{data.model}: in={data.input_tokens or 0} out={data.output_tokens or 0} cost={data.cost or 0}") + +session.on(on_usage) +``` + +
+ +
+Go + + +```go +package main + +import ( + "context" + "fmt" + + copilot "github.com/github/copilot-sdk/go" + "github.com/github/copilot-sdk/go/rpc" +) + +func main() { + ctx := context.Background() + client := copilot.NewClient(nil) + client.Start(ctx) + + session, _ := client.CreateSession(ctx, &copilot.SessionConfig{ + Streaming: copilot.Bool(true), + OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) { + return &rpc.PermissionDecisionApproveOnce{}, nil + }, + }) + + session.On(func(event copilot.SessionEvent) { + d, ok := event.Data.(*copilot.AssistantUsageData) + if !ok { + return + } + in, out := int64(0), int64(0) + if d.InputTokens != nil { + in = *d.InputTokens + } + if d.OutputTokens != nil { + out = *d.OutputTokens + } + fmt.Printf("%s: in=%d out=%d\n", d.Model, in, out) + }) + _ = session +} +``` + + +```go +session.On(func(event copilot.SessionEvent) { + d, ok := event.Data.(*copilot.AssistantUsageData) + if !ok { + return + } + in, out := int64(0), int64(0) + if d.InputTokens != nil { + in = *d.InputTokens + } + if d.OutputTokens != nil { + out = *d.OutputTokens + } + fmt.Printf("%s: in=%d out=%d\n", d.Model, in, out) +}) +``` + +
+ +
+.NET + + +```csharp +using GitHub.Copilot; + +await using var client = new CopilotClient(); +await using var session = await client.CreateSessionAsync(new SessionConfig { Streaming = true }); + +session.On(evt => +{ + var data = evt.Data; + Console.WriteLine( + $"{data.Model}: in={data.InputTokens ?? 0} out={data.OutputTokens ?? 0} cost={data.Cost ?? 0}"); +}); +``` + + +```csharp +session.On(evt => +{ + var data = evt.Data; + Console.WriteLine( + $"{data.Model}: in={data.InputTokens ?? 0} out={data.OutputTokens ?? 0} cost={data.Cost ?? 0}"); +}); +``` + +
+ +
+Java + + +```java +session.on(AssistantUsageEvent.class, event -> { + var data = event.getData(); + long in = data.inputTokens() != null ? data.inputTokens() : 0; + long out = data.outputTokens() != null ? data.outputTokens() : 0; + System.out.printf("%s: in=%d out=%d%n", data.model(), in, out); +}); +``` + +
+ +
+Rust
+
+```rust
+use github_copilot_sdk::session_events::AssistantUsageData;
+
+let mut events = session.subscribe();
+while let Ok(event) = events.recv().await {
+    if event.event_type == "assistant.usage" {
+        if let Some(data) = event.typed_data::<AssistantUsageData>() {
+            println!(
+                "{}: in={} out={}",
+                data.model,
+                data.input_tokens.unwrap_or(0),
+                data.output_tokens.unwrap_or(0),
+            );
+        }
+    }
+}
+```
+
+
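Because `assistant.usage` events are not replayed on resume, an application that wants live per-model totals has to keep its own tally while the session runs. The following is a minimal sketch of such a tally, assuming only the `model`, `inputTokens`, and `outputTokens` fields from the table above. `UsageData`, `UsageTotals`, and `addUsage` are hypothetical names used for illustration; they are not SDK types.

```typescript
// Hypothetical shape mirroring three of the `assistant.usage` fields above.
// The SDK's generated type has more fields; this is not that type.
interface UsageData {
  model: string;
  inputTokens?: number;
  outputTokens?: number;
}

// Hypothetical accumulator kept by the application, one per model.
interface UsageTotals {
  calls: number;
  inputTokens: number;
  outputTokens: number;
}

// Fold one event payload into the per-model running totals.
// Missing token counts are treated as zero, as in the examples above.
function addUsage(totals: Map<string, UsageTotals>, data: UsageData): void {
  const current = totals.get(data.model) ?? {
    calls: 0,
    inputTokens: 0,
    outputTokens: 0,
  };
  current.calls += 1;
  current.inputTokens += data.inputTokens ?? 0;
  current.outputTokens += data.outputTokens ?? 0;
  totals.set(data.model, current);
}

// Made-up payloads standing in for live events.
const totals = new Map<string, UsageTotals>();
addUsage(totals, { model: "model-a", inputTokens: 1200, outputTokens: 300 });
addUsage(totals, { model: "model-a", inputTokens: 800 });
addUsage(totals, { model: "model-b", outputTokens: 50 });
```

In an application, the call to `addUsage` would sit inside the `assistant.usage` handler, assuming the event payload exposes these fields the way the examples above read them. For billing-grade totals, `session.usage.getMetrics` remains the better source.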
+
+## Context-window utilization
+
+Token counts tell you what each call consumed. Context-window utilization tells you how full the model's prompt window is right now, which is useful for showing a progress bar or warning the user before automatic compaction kicks in.
+
+### Live updates with `session.usage_info`
+
+The runtime emits a `session.usage_info` event whenever the number of tokens in the context window changes. The example uses `currentTokens` and `tokenLimit`; see [Streaming events](./streaming-events.md#sessionusage_info) for the complete payload.
+
+| Field | Type | Description |
+|---|---|---|
+| `currentTokens` | `number` | Tokens currently in the context window |
+| `tokenLimit` | `number` | Maximum tokens for the model's context window |
+
+
+Node.js / TypeScript + + +```typescript +import { CopilotClient } from "@github/copilot-sdk"; + +const client = new CopilotClient(); +const session = await client.createSession({ streaming: true }); + +session.on("session.usage_info", (event) => { + const { currentTokens, tokenLimit } = event.data; + const pct = Math.round((currentTokens / tokenLimit) * 100); + console.log(`Context: ${currentTokens}/${tokenLimit} (${pct}%)`); +}); +``` + + +```typescript +session.on("session.usage_info", (event) => { + const { currentTokens, tokenLimit } = event.data; + const pct = Math.round((currentTokens / tokenLimit) * 100); + console.log(`Context: ${currentTokens}/${tokenLimit} (${pct}%)`); +}); +``` + +
+ +
+Python + + +```python +from copilot import CopilotClient +from copilot.session_events import SessionEventType + +client = CopilotClient() +session = await client.create_session(streaming=True) + +def on_usage_info(event): + if event.type == SessionEventType.SESSION_USAGE_INFO: + data = event.data + pct = round(data.current_tokens / data.token_limit * 100) + print(f"Context: {data.current_tokens}/{data.token_limit} ({pct}%)") + +session.on(on_usage_info) +``` + + +```python +def on_usage_info(event): + if event.type == SessionEventType.SESSION_USAGE_INFO: + data = event.data + pct = round(data.current_tokens / data.token_limit * 100) + print(f"Context: {data.current_tokens}/{data.token_limit} ({pct}%)") + +session.on(on_usage_info) +``` + +
+ +
+Go + + +```go +package main + +import ( + "context" + "fmt" + + copilot "github.com/github/copilot-sdk/go" + "github.com/github/copilot-sdk/go/rpc" +) + +func main() { + ctx := context.Background() + client := copilot.NewClient(nil) + client.Start(ctx) + + session, _ := client.CreateSession(ctx, &copilot.SessionConfig{ + Streaming: copilot.Bool(true), + OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (rpc.PermissionDecision, error) { + return &rpc.PermissionDecisionApproveOnce{}, nil + }, + }) + + session.On(func(event copilot.SessionEvent) { + d, ok := event.Data.(*copilot.SessionUsageInfoData) + if !ok { + return + } + pct := int(float64(d.CurrentTokens) / float64(d.TokenLimit) * 100) + fmt.Printf("Context: %d/%d (%d%%)\n", d.CurrentTokens, d.TokenLimit, pct) + }) + _ = session +} +``` + + +```go +session.On(func(event copilot.SessionEvent) { + d, ok := event.Data.(*copilot.SessionUsageInfoData) + if !ok { + return + } + pct := int(float64(d.CurrentTokens) / float64(d.TokenLimit) * 100) + fmt.Printf("Context: %d/%d (%d%%)\n", d.CurrentTokens, d.TokenLimit, pct) +}) +``` + +
+ +
+.NET + + +```csharp +using GitHub.Copilot; + +await using var client = new CopilotClient(); +await using var session = await client.CreateSessionAsync(new SessionConfig { Streaming = true }); + +session.On(evt => +{ + var pct = (int)Math.Round((double)evt.Data.CurrentTokens / evt.Data.TokenLimit * 100); + Console.WriteLine($"Context: {evt.Data.CurrentTokens}/{evt.Data.TokenLimit} ({pct}%)"); +}); +``` + + +```csharp +session.On(evt => +{ + var pct = (int)Math.Round((double)evt.Data.CurrentTokens / evt.Data.TokenLimit * 100); + Console.WriteLine($"Context: {evt.Data.CurrentTokens}/{evt.Data.TokenLimit} ({pct}%)"); +}); +``` + +
+ +
+Java + + +```java +session.on(SessionUsageInfoEvent.class, event -> { + var data = event.getData(); + long pct = Math.round((double) data.currentTokens() / data.tokenLimit() * 100); + System.out.printf("Context: %d/%d (%d%%)%n", data.currentTokens(), data.tokenLimit(), pct); +}); +``` + +
+ +
+Rust
+
+```rust
+use github_copilot_sdk::session_events::SessionUsageInfoData;
+
+let mut events = session.subscribe();
+while let Ok(event) = events.recv().await {
+    if event.event_type == "session.usage_info" {
+        if let Some(data) = event.typed_data::<SessionUsageInfoData>() {
+            let pct = (data.current_tokens as f64 / data.token_limit as f64 * 100.0) as i64;
+            println!("Context: {}/{} ({}%)", data.current_tokens, data.token_limit, pct);
+        }
+    }
+}
+```
+
+
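The percentage arithmetic in the handlers above divides by `tokenLimit` directly. For a progress bar or a pre-compaction warning, it can help to factor that into a small helper that guards against a zero limit and clamps the result. This is a sketch under stated assumptions: `contextUsage` and `ContextUsage` are hypothetical names, and the 80% warning threshold is an arbitrary illustrative default, not a value the runtime defines.

```typescript
// Hypothetical result type for this sketch; not part of the SDK.
interface ContextUsage {
  percent: number; // 0 to 100, rounded to the nearest integer
  nearLimit: boolean; // true once percent reaches the warning threshold
}

// Turn the two `session.usage_info` fields into a clamped percentage.
// ASSUMPTION: `warnAtPercent` is an application-chosen threshold.
function contextUsage(
  currentTokens: number,
  tokenLimit: number,
  warnAtPercent = 80,
): ContextUsage {
  // Avoid NaN or Infinity when the limit is zero, negative, or NaN.
  if (!(tokenLimit > 0)) {
    return { percent: 0, nearLimit: false };
  }
  const raw = (currentTokens / tokenLimit) * 100;
  const percent = Math.min(100, Math.max(0, Math.round(raw)));
  return { percent, nearLimit: percent >= warnAtPercent };
}

// Made-up sample values.
const usage = contextUsage(96_000, 128_000);
console.log(`${usage.percent}% used${usage.nearLimit ? " (near limit)" : ""}`);
```

Clamping keeps a UI element well-behaved if a payload ever reports more tokens than the limit; whether that can happen in practice is not something this page states.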
+
+### On-demand breakdown with `session.metadata.contextInfo`
+
+Events only fire when the context changes. To read the current breakdown at any moment (for example, right after resuming a session), call `session.metadata.contextInfo`. Pass `0` for the token limits to use the model's defaults.
+
+The result's `contextInfo` is `null` until the session has been initialized (the system prompt and tool metadata have been cached). Once populated, it breaks `totalTokens` down into `systemTokens`, `conversationTokens`, and `toolDefinitionsTokens`, alongside the `promptTokenLimit`.
+
+
+Node.js / TypeScript + + +```typescript +import { CopilotClient } from "@github/copilot-sdk"; + +const client = new CopilotClient(); +const session = await client.createSession({}); + +const { contextInfo } = await session.rpc.metadata.contextInfo({ + promptTokenLimit: 0, + outputTokenLimit: 0, +}); + +if (contextInfo) { + console.log( + `Total ${contextInfo.totalTokens}/${contextInfo.promptTokenLimit} ` + + `(system=${contextInfo.systemTokens}, conversation=${contextInfo.conversationTokens})`, + ); +} +``` + + +```typescript +const { contextInfo } = await session.rpc.metadata.contextInfo({ + promptTokenLimit: 0, + outputTokenLimit: 0, +}); + +if (contextInfo) { + console.log( + `Total ${contextInfo.totalTokens}/${contextInfo.promptTokenLimit} ` + + `(system=${contextInfo.systemTokens}, conversation=${contextInfo.conversationTokens})`, + ); +} +``` + +
+ +
+Python + + +```python +from copilot import CopilotClient +from copilot.rpc import MetadataContextInfoRequest + +client = CopilotClient() +session = await client.create_session() + +result = await session.rpc.metadata.context_info( + MetadataContextInfoRequest(prompt_token_limit=0, output_token_limit=0) +) +info = result.context_info + +if info is not None: + print( + f"Total {info.total_tokens}/{info.prompt_token_limit} " + f"(system={info.system_tokens}, conversation={info.conversation_tokens})" + ) +``` + + +```python +result = await session.rpc.metadata.context_info( + MetadataContextInfoRequest(prompt_token_limit=0, output_token_limit=0) +) +info = result.context_info + +if info is not None: + print( + f"Total {info.total_tokens}/{info.prompt_token_limit} " + f"(system={info.system_tokens}, conversation={info.conversation_tokens})" + ) +``` + +
+ +
+Go + + +```go +package main + +import ( + "context" + "fmt" + + copilot "github.com/github/copilot-sdk/go" + "github.com/github/copilot-sdk/go/rpc" +) + +func main() { + ctx := context.Background() + client := copilot.NewClient(nil) + client.Start(ctx) + + session, _ := client.CreateSession(ctx, &copilot.SessionConfig{}) + + result, _ := session.RPC.Metadata.ContextInfo(ctx, &rpc.MetadataContextInfoRequest{ + PromptTokenLimit: 0, + OutputTokenLimit: 0, + }) + + if info := result.ContextInfo; info != nil { + fmt.Printf("Total %d/%d (system=%d, conversation=%d)\n", + info.TotalTokens, info.PromptTokenLimit, info.SystemTokens, info.ConversationTokens) + } +} +``` + + +```go +result, _ := session.RPC.Metadata.ContextInfo(ctx, &rpc.MetadataContextInfoRequest{ + PromptTokenLimit: 0, + OutputTokenLimit: 0, +}) + +if info := result.ContextInfo; info != nil { + fmt.Printf("Total %d/%d (system=%d, conversation=%d)\n", + info.TotalTokens, info.PromptTokenLimit, info.SystemTokens, info.ConversationTokens) +} +``` + +
+ +
+.NET + + +```csharp +#pragma warning disable GHCP001 +using GitHub.Copilot; + +await using var client = new CopilotClient(); +await using var session = await client.CreateSessionAsync(new SessionConfig()); + +var result = await session.Rpc.Metadata.ContextInfoAsync(promptTokenLimit: 0, outputTokenLimit: 0); +var info = result.ContextInfo; + +if (info is not null) +{ + Console.WriteLine( + $"Total {info.TotalTokens}/{info.PromptTokenLimit} " + + $"(system={info.SystemTokens}, conversation={info.ConversationTokens})"); +} +#pragma warning restore GHCP001 +``` + + +```csharp +var result = await session.Rpc.Metadata.ContextInfoAsync(promptTokenLimit: 0, outputTokenLimit: 0); +var info = result.ContextInfo; + +if (info is not null) +{ + Console.WriteLine( + $"Total {info.TotalTokens}/{info.PromptTokenLimit} " + + $"(system={info.SystemTokens}, conversation={info.ConversationTokens})"); +} +``` + +
+ +
+Java + + +```java +var result = session.getRpc().metadata + .contextInfo(new SessionMetadataContextInfoParams(null, 0L, 0L, null)) + .join(); +var info = result.contextInfo(); + +if (info != null) { + System.out.printf("Total %d/%d (system=%d, conversation=%d)%n", + info.totalTokens(), info.promptTokenLimit(), info.systemTokens(), info.conversationTokens()); +} +``` + +
+ +
+Rust + +```rust +use github_copilot_sdk::rpc::MetadataContextInfoRequest; + +let result = session + .rpc() + .metadata() + .context_info(MetadataContextInfoRequest { + prompt_token_limit: 0, + output_token_limit: 0, + selected_model: None, + }) + .await?; + +if let Some(info) = result.context_info { + println!( + "Total {}/{} (system={}, conversation={})", + info.total_tokens, info.prompt_token_limit, info.system_tokens, info.conversation_tokens, + ); +} +``` + +
+
+## Accumulated AI credit and token totals
+
+`session.usage.getMetrics` returns the running totals for the whole session in a single call. This is the simplest way to read AI credit cost, because it aggregates every API call (main agent and sub-agents) for you.
+
+The example uses the fields below. The generated `UsageGetMetricsResult` type is the full reference.
+
+| Field | Type | Description |
+|---|---|---|
+| `totalNanoAiu` | `number` | Session-wide AI credit cost, in nano-AI units |
+| `totalPremiumRequestCost` | `number` | Premium request cost across all models, after multipliers |
+| `modelMetrics` | `Record` | Per-model breakdown keyed by model ID; each entry has `usage.inputTokens`, `usage.outputTokens`, and `totalNanoAiu` |
+
+> [!NOTE]
+> Cost is reported in **nano-AI units** (the field is named `totalNanoAiu`). The exact conversion to AI credits and the precise meaning of premium request accounting are defined by GitHub Copilot billing, not by the SDK. Treat [GitHub's Copilot billing documentation](https://docs.github.com/en/copilot/managing-copilot/understanding-and-managing-copilot-usage) as the source of truth and verify before surfacing currency-like values to users. The examples divide by `1e9` as a convenience, following the SI `nano` prefix; confirm this matches current billing before relying on it. The `modelMetrics` and `tokenDetails` maps are keyed by runtime strings (model IDs and token-type names) that the SDK type system does not validate.
+
+
+Node.js / TypeScript + + +```typescript +import { CopilotClient } from "@github/copilot-sdk"; + +const client = new CopilotClient(); +const session = await client.createSession({}); + +const metrics = await session.rpc.usage.getMetrics(); + +const aiCredits = (metrics.totalNanoAiu ?? 0) / 1e9; +console.log(`AI credits used: ${aiCredits.toFixed(6)}`); +console.log(`Premium requests: ${metrics.totalPremiumRequestCost}`); + +for (const [model, m] of Object.entries(metrics.modelMetrics)) { + if (!m) continue; + console.log( + `${model}: in=${m.usage.inputTokens} out=${m.usage.outputTokens} ` + + `nanoAiu=${m.totalNanoAiu ?? 0}`, + ); +} +``` + + +```typescript +const metrics = await session.rpc.usage.getMetrics(); + +const aiCredits = (metrics.totalNanoAiu ?? 0) / 1e9; +console.log(`AI credits used: ${aiCredits.toFixed(6)}`); +console.log(`Premium requests: ${metrics.totalPremiumRequestCost}`); + +for (const [model, m] of Object.entries(metrics.modelMetrics)) { + if (!m) continue; + console.log( + `${model}: in=${m.usage.inputTokens} out=${m.usage.outputTokens} ` + + `nanoAiu=${m.totalNanoAiu ?? 0}`, + ); +} +``` + +
+ +
+Python + + +```python +from copilot import CopilotClient + +client = CopilotClient() +session = await client.create_session() + +metrics = await session.rpc.usage.get_metrics() + +ai_credits = (metrics.total_nano_aiu or 0) / 1e9 +print(f"AI credits used: {ai_credits:.6f}") +print(f"Premium requests: {metrics.total_premium_request_cost}") + +for model, m in metrics.model_metrics.items(): + print(f"{model}: in={m.usage.input_tokens} out={m.usage.output_tokens} nanoAiu={m.total_nano_aiu or 0}") +``` + + +```python +metrics = await session.rpc.usage.get_metrics() + +ai_credits = (metrics.total_nano_aiu or 0) / 1e9 +print(f"AI credits used: {ai_credits:.6f}") +print(f"Premium requests: {metrics.total_premium_request_cost}") + +for model, m in metrics.model_metrics.items(): + print(f"{model}: in={m.usage.input_tokens} out={m.usage.output_tokens} nanoAiu={m.total_nano_aiu or 0}") +``` + +
+ +
+Go + + +```go +package main + +import ( + "context" + "fmt" + + copilot "github.com/github/copilot-sdk/go" +) + +func main() { + ctx := context.Background() + client := copilot.NewClient(nil) + client.Start(ctx) + + session, _ := client.CreateSession(ctx, &copilot.SessionConfig{}) + + metrics, _ := session.RPC.Usage.GetMetrics(ctx) + + aiCredits := float64(0) + if metrics.TotalNanoAiu != nil { + aiCredits = *metrics.TotalNanoAiu / 1e9 + } + fmt.Printf("AI credits used: %.6f\n", aiCredits) + fmt.Printf("Premium requests: %v\n", metrics.TotalPremiumRequestCost) + + for model, m := range metrics.ModelMetrics { + nanoAiu := float64(0) + if m.TotalNanoAiu != nil { + nanoAiu = *m.TotalNanoAiu + } + fmt.Printf("%s: in=%d out=%d nanoAiu=%v\n", model, m.Usage.InputTokens, m.Usage.OutputTokens, nanoAiu) + } +} +``` + + +```go +metrics, _ := session.RPC.Usage.GetMetrics(ctx) + +aiCredits := float64(0) +if metrics.TotalNanoAiu != nil { + aiCredits = *metrics.TotalNanoAiu / 1e9 +} +fmt.Printf("AI credits used: %.6f\n", aiCredits) +fmt.Printf("Premium requests: %v\n", metrics.TotalPremiumRequestCost) + +for model, m := range metrics.ModelMetrics { + nanoAiu := float64(0) + if m.TotalNanoAiu != nil { + nanoAiu = *m.TotalNanoAiu + } + fmt.Printf("%s: in=%d out=%d nanoAiu=%v\n", model, m.Usage.InputTokens, m.Usage.OutputTokens, nanoAiu) +} +``` + +
+ +
+.NET + + +```csharp +#pragma warning disable GHCP001 +using GitHub.Copilot; + +await using var client = new CopilotClient(); +await using var session = await client.CreateSessionAsync(new SessionConfig()); + +var metrics = await session.Rpc.Usage.GetMetricsAsync(); + +var aiCredits = (metrics.TotalNanoAiu ?? 0) / 1e9; +Console.WriteLine($"AI credits used: {aiCredits:F6}"); +Console.WriteLine($"Premium requests: {metrics.TotalPremiumRequestCost}"); + +foreach (var (model, m) in metrics.ModelMetrics) +{ + Console.WriteLine( + $"{model}: in={m.Usage.InputTokens} out={m.Usage.OutputTokens} nanoAiu={m.TotalNanoAiu ?? 0}"); +} +#pragma warning restore GHCP001 +``` + + +```csharp +var metrics = await session.Rpc.Usage.GetMetricsAsync(); + +var aiCredits = (metrics.TotalNanoAiu ?? 0) / 1e9; +Console.WriteLine($"AI credits used: {aiCredits:F6}"); +Console.WriteLine($"Premium requests: {metrics.TotalPremiumRequestCost}"); + +foreach (var (model, m) in metrics.ModelMetrics) +{ + Console.WriteLine( + $"{model}: in={m.Usage.InputTokens} out={m.Usage.OutputTokens} nanoAiu={m.TotalNanoAiu ?? 0}"); +} +``` + +
+ +
+Java + + +```java +var metrics = session.getRpc().usage.getMetrics().join(); + +double aiCredits = metrics.totalNanoAiu() != null ? metrics.totalNanoAiu() / 1e9 : 0; +System.out.printf("AI credits used: %.6f%n", aiCredits); +System.out.printf("Premium requests: %s%n", metrics.totalPremiumRequestCost()); + +metrics.modelMetrics().forEach((model, m) -> { + double nanoAiu = m.totalNanoAiu() != null ? m.totalNanoAiu() : 0; + System.out.printf("%s: in=%d out=%d nanoAiu=%s%n", + model, m.usage().inputTokens(), m.usage().outputTokens(), nanoAiu); +}); +``` + +
+ +
+Rust + +```rust +let metrics = session.rpc().usage().get_metrics().await?; + +let ai_credits = metrics.total_nano_aiu.unwrap_or(0.0) / 1e9; +println!("AI credits used: {ai_credits:.6}"); +println!("Premium requests: {}", metrics.total_premium_request_cost); + +for (model, m) in &metrics.model_metrics { + let nano_aiu = m.total_nano_aiu.unwrap_or(0.0); + println!( + "{model}: in={} out={} nanoAiu={nano_aiu}", + m.usage.input_tokens, m.usage.output_tokens, + ); +} +``` + +
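The examples above print raw per-model `totalNanoAiu` values. When presenting a breakdown, it is often more useful to show each model's share of the session total. The sketch below works on a plain object shaped like the `modelMetrics` map described in the table; `ModelMetricSketch`, `nanoAiuToCredits`, and `creditShareByModel` are hypothetical names, and the division by `1e9` carries the same caveat as the note above.

```typescript
// Hypothetical, simplified entry shape based on the table above.
// The generated SDK type has more fields; this is not that type.
interface ModelMetricSketch {
  totalNanoAiu?: number;
}

// ASSUMPTION: SI "nano" prefix; confirm against current billing docs.
const NANO = 1e9;

// Convert a nano-AIU value to AI credits, treating a missing value as zero.
function nanoAiuToCredits(nanoAiu: number | undefined): number {
  return (nanoAiu ?? 0) / NANO;
}

// Compute each model's share of the summed cost, as a fraction in [0, 1].
// Returns zero shares when the total is zero to avoid dividing by zero.
function creditShareByModel(
  modelMetrics: Record<string, ModelMetricSketch | undefined>,
): Record<string, number> {
  const entries = Object.entries(modelMetrics);
  const total = entries.reduce((sum, [, m]) => sum + (m?.totalNanoAiu ?? 0), 0);
  const shares: Record<string, number> = {};
  for (const [model, m] of entries) {
    shares[model] = total > 0 ? (m?.totalNanoAiu ?? 0) / total : 0;
  }
  return shares;
}

// Made-up sample values; real keys are runtime model IDs.
const sample: Record<string, ModelMetricSketch> = {
  "model-a": { totalNanoAiu: 3_000_000_000 },
  "model-b": { totalNanoAiu: 1_000_000_000 },
};
const shares = creditShareByModel(sample);
```

Because the map keys are runtime strings, the helper iterates whatever keys are present rather than looking up fixed model names.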
+
+## Per-model AI credit pricing
+
+To estimate cost before you run a turn, read each model's token prices from `models.list`. This is a server-scoped call on the client, so it does not need a session. Prices are expressed in AI credits per billing batch of tokens. The generated `ModelBillingTokenPrices` type lists every field, including `cachePrice`.
+
+| Field | Type | Description |
+|---|---|---|
+| `billing.multiplier` | `number` | Premium request cost multiplier relative to the base rate |
+| `billing.tokenPrices.inputPrice` | `number` | AI credit cost per batch of input tokens |
+| `billing.tokenPrices.outputPrice` | `number` | AI credit cost per batch of output tokens |
+| `billing.tokenPrices.batchSize` | `number` | Number of tokens per billing batch |
+
+> [!NOTE]
+> Price values change as plans and models evolve. Read them at runtime as shown below; never hard-code the numbers into your application.
+
+
+Node.js / TypeScript + + +```typescript +import { CopilotClient } from "@github/copilot-sdk"; + +const client = new CopilotClient(); + +const { models } = await client.rpc.models.list({}); + +for (const model of models) { + const prices = model.billing?.tokenPrices; + if (!prices) continue; + console.log( + `${model.id}: input=${prices.inputPrice} output=${prices.outputPrice} ` + + `per ${prices.batchSize} tokens (x${model.billing?.multiplier ?? 1})`, + ); +} +``` + + +```typescript +const { models } = await client.rpc.models.list({}); + +for (const model of models) { + const prices = model.billing?.tokenPrices; + if (!prices) continue; + console.log( + `${model.id}: input=${prices.inputPrice} output=${prices.outputPrice} ` + + `per ${prices.batchSize} tokens (x${model.billing?.multiplier ?? 1})`, + ); +} +``` + +
+ +
+Python + + +```python +from copilot import CopilotClient +from copilot.rpc import ModelsListRequest + +client = CopilotClient() + +result = await client.rpc.models.list(ModelsListRequest()) + +for model in result.models: + prices = model.billing.token_prices if model.billing else None + if prices is None: + continue + multiplier = model.billing.multiplier if model.billing else 1 + print( + f"{model.id}: input={prices.input_price} output={prices.output_price} " + f"per {prices.batch_size} tokens (x{multiplier})" + ) +``` + + +```python +result = await client.rpc.models.list(ModelsListRequest()) + +for model in result.models: + prices = model.billing.token_prices if model.billing else None + if prices is None: + continue + multiplier = model.billing.multiplier if model.billing else 1 + print( + f"{model.id}: input={prices.input_price} output={prices.output_price} " + f"per {prices.batch_size} tokens (x{multiplier})" + ) +``` + +
+ +
+Go + + +```go +package main + +import ( + "context" + "fmt" + + copilot "github.com/github/copilot-sdk/go" + "github.com/github/copilot-sdk/go/rpc" +) + +func main() { + ctx := context.Background() + client := copilot.NewClient(nil) + client.Start(ctx) + + list, _ := client.RPC.Models.List(ctx, &rpc.ModelsListRequest{}) + + for _, model := range list.Models { + if model.Billing == nil || model.Billing.TokenPrices == nil { + continue + } + prices := model.Billing.TokenPrices + multiplier := 1.0 + if model.Billing.Multiplier != nil { + multiplier = *model.Billing.Multiplier + } + in, out := 0.0, 0.0 + if prices.InputPrice != nil { + in = *prices.InputPrice + } + if prices.OutputPrice != nil { + out = *prices.OutputPrice + } + batch := int64(0) + if prices.BatchSize != nil { + batch = *prices.BatchSize + } + fmt.Printf("%s: input=%v output=%v per %d tokens (x%v)\n", model.ID, in, out, batch, multiplier) + } +} +``` + + +```go +list, _ := client.RPC.Models.List(ctx, &rpc.ModelsListRequest{}) + +for _, model := range list.Models { + if model.Billing == nil || model.Billing.TokenPrices == nil { + continue + } + prices := model.Billing.TokenPrices + multiplier := 1.0 + if model.Billing.Multiplier != nil { + multiplier = *model.Billing.Multiplier + } + in, out := 0.0, 0.0 + if prices.InputPrice != nil { + in = *prices.InputPrice + } + if prices.OutputPrice != nil { + out = *prices.OutputPrice + } + batch := int64(0) + if prices.BatchSize != nil { + batch = *prices.BatchSize + } + fmt.Printf("%s: input=%v output=%v per %d tokens (x%v)\n", model.ID, in, out, batch, multiplier) +} +``` + +
+ +
+.NET + + +```csharp +using GitHub.Copilot; + +await using var client = new CopilotClient(); + +var list = await client.Rpc.Models.ListAsync(); + +foreach (var model in list.Models) +{ + var prices = model.Billing?.TokenPrices; + if (prices is null) continue; + Console.WriteLine( + $"{model.Id}: input={prices.InputPrice} output={prices.OutputPrice} " + + $"per {prices.BatchSize} tokens (x{model.Billing?.Multiplier ?? 1})"); +} +``` + + +```csharp +var list = await client.Rpc.Models.ListAsync(); + +foreach (var model in list.Models) +{ + var prices = model.Billing?.TokenPrices; + if (prices is null) continue; + Console.WriteLine( + $"{model.Id}: input={prices.InputPrice} output={prices.OutputPrice} " + + $"per {prices.BatchSize} tokens (x{model.Billing?.Multiplier ?? 1})"); +} +``` + +
+ +
+Java + + +```java +var list = client.getRpc().models.list().join(); + +for (var model : list.models()) { + var billing = model.billing(); + if (billing == null || billing.tokenPrices() == null) { + continue; + } + var prices = billing.tokenPrices(); + double multiplier = billing.multiplier() != null ? billing.multiplier() : 1; + System.out.printf("%s: input=%s output=%s per %d tokens (x%s)%n", + model.id(), prices.inputPrice(), prices.outputPrice(), prices.batchSize(), multiplier); +} +``` + +
+ +
+Rust + +```rust +let list = client.rpc().models().list().await?; + +for model in &list.models { + let Some(billing) = &model.billing else { continue }; + let Some(prices) = &billing.token_prices else { continue }; + let multiplier = billing.multiplier.unwrap_or(1.0); + println!( + "{}: input={} output={} per {} tokens (x{multiplier})", + model.id, + prices.input_price.unwrap_or(0.0), + prices.output_price.unwrap_or(0.0), + prices.batch_size.unwrap_or(0), + ); +} +``` + +
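The examples above print the prices but stop short of the estimate the section opens with. Since prices are quoted per batch of `batchSize` tokens, a rough estimate for one call is `tokens / batchSize * price`, summed over input and output. The sketch below shows that arithmetic. `TokenPricesSketch` and `estimateCredits` are hypothetical names, and the pro rata treatment of partial batches is an assumption: actual billing may round to whole batches or apply cache pricing that this sketch ignores.

```typescript
// Hypothetical shape mirroring `billing.tokenPrices` from the table above.
interface TokenPricesSketch {
  inputPrice?: number;
  outputPrice?: number;
  batchSize?: number;
}

// Estimate AI credits for one call from expected token counts.
// Returns undefined when pricing is missing or the batch size is unusable.
// ASSUMPTION: cost scales pro rata within a batch (no rounding up).
function estimateCredits(
  inputTokens: number,
  outputTokens: number,
  prices: TokenPricesSketch | undefined,
): number | undefined {
  const batch = prices?.batchSize;
  if (!prices || !batch || batch <= 0) return undefined;
  const input = (inputTokens / batch) * (prices.inputPrice ?? 0);
  const output = (outputTokens / batch) * (prices.outputPrice ?? 0);
  return input + output;
}

// Made-up prices for illustration only; read real values from `models.list`.
const estimate = estimateCredits(2_000_000, 500_000, {
  inputPrice: 3,
  outputPrice: 12,
  batchSize: 1_000_000,
});
```

Returning `undefined` rather than `0` for missing pricing lets a caller distinguish "free" from "unknown", which matters when the estimate is shown to a user.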
+
+## Account quota and premium interactions
+
+`account.getQuota` reports the authenticated user's remaining Copilot entitlement. The result's `quotaSnapshots` map is keyed by quota type, commonly `premium_interactions`, `chat`, and `completions`. Use it to show users how much of their monthly allowance is left, or to gate work before they hit a limit.
+
+The examples use the fields below; the generated `AccountQuotaSnapshot` type is the full reference. The `quotaSnapshots` keys are runtime strings that the SDK type system does not validate, so check that a key is present before reading its snapshot.
+
+| Field | Type | Description |
+|---|---|---|
+| `entitlementRequests` | `number` | Requests included in the entitlement, or `-1` for unlimited |
+| `usedRequests` | `number` | Requests used so far this period |
+| `remainingPercentage` | `number` | Percentage of the entitlement remaining |
+| `resetDate` | `string` | ISO 8601 date when the quota resets; may be absent |
+
+> [!TIP]
+> To read quota for a specific user rather than the connection's global auth context (for example, in a multi-tenant backend), pass that user's GitHub token to `getQuota`. See [Multi-tenancy](../setup/multi-tenancy.md).
+
+
+Node.js / TypeScript
+
+
+```typescript
+import { CopilotClient } from "@github/copilot-sdk";
+
+const client = new CopilotClient();
+
+const { quotaSnapshots } = await client.rpc.account.getQuota({});
+const premium = quotaSnapshots["premium_interactions"];
+
+if (premium) {
+    console.log(
+        `Premium interactions: ${premium.usedRequests}/${premium.entitlementRequests} ` +
+            `(${premium.remainingPercentage.toFixed(1)}% left, resets ${premium.resetDate ?? "n/a"})`,
+    );
+}
+```
+
+
+```typescript
+const { quotaSnapshots } = await client.rpc.account.getQuota({});
+const premium = quotaSnapshots["premium_interactions"];
+
+if (premium) {
+    console.log(
+        `Premium interactions: ${premium.usedRequests}/${premium.entitlementRequests} ` +
+            `(${premium.remainingPercentage.toFixed(1)}% left, resets ${premium.resetDate ?? "n/a"})`,
+    );
+}
+```
+
+
+ +
+Python
+
+
+```python
+from copilot import CopilotClient
+from copilot.rpc import AccountGetQuotaRequest
+
+client = CopilotClient()
+
+result = await client.rpc.account.get_quota(AccountGetQuotaRequest())
+premium = result.quota_snapshots.get("premium_interactions")
+
+if premium is not None:
+    print(
+        f"Premium interactions: {premium.used_requests}/{premium.entitlement_requests} "
+        f"({premium.remaining_percentage:.1f}% left, resets {premium.reset_date or 'n/a'})"
+    )
+```
+
+
+```python
+result = await client.rpc.account.get_quota(AccountGetQuotaRequest())
+premium = result.quota_snapshots.get("premium_interactions")
+
+if premium is not None:
+    print(
+        f"Premium interactions: {premium.used_requests}/{premium.entitlement_requests} "
+        f"({premium.remaining_percentage:.1f}% left, resets {premium.reset_date or 'n/a'})"
+    )
+```
+
+
+ +
+Go
+
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"time"
+
+	copilot "github.com/github/copilot-sdk/go"
+	"github.com/github/copilot-sdk/go/rpc"
+)
+
+func main() {
+	ctx := context.Background()
+	client := copilot.NewClient(nil)
+	client.Start(ctx)
+
+	result, _ := client.RPC.Account.GetQuota(ctx, &rpc.AccountGetQuotaRequest{})
+
+	if premium, ok := result.QuotaSnapshots["premium_interactions"]; ok {
+		resets := "n/a"
+		if premium.ResetDate != nil {
+			resets = premium.ResetDate.Format(time.RFC3339)
+		}
+		fmt.Printf("Premium interactions: %d/%d (%.1f%% left, resets %s)\n",
+			premium.UsedRequests, premium.EntitlementRequests, premium.RemainingPercentage, resets)
+	}
+}
+```
+
+
+```go
+result, _ := client.RPC.Account.GetQuota(ctx, &rpc.AccountGetQuotaRequest{})
+
+if premium, ok := result.QuotaSnapshots["premium_interactions"]; ok {
+	resets := "n/a"
+	if premium.ResetDate != nil {
+		resets = premium.ResetDate.Format(time.RFC3339)
+	}
+	fmt.Printf("Premium interactions: %d/%d (%.1f%% left, resets %s)\n",
+		premium.UsedRequests, premium.EntitlementRequests, premium.RemainingPercentage, resets)
+}
+```
+
+
+ +
+.NET
+
+
+```csharp
+using GitHub.Copilot;
+
+await using var client = new CopilotClient();
+
+var result = await client.Rpc.Account.GetQuotaAsync();
+
+if (result.QuotaSnapshots.TryGetValue("premium_interactions", out var premium))
+{
+    Console.WriteLine(
+        $"Premium interactions: {premium.UsedRequests}/{premium.EntitlementRequests} " +
+        $"({premium.RemainingPercentage:F1}% left, resets {premium.ResetDate?.ToString("o") ?? "n/a"})");
+}
+```
+
+
+```csharp
+var result = await client.Rpc.Account.GetQuotaAsync();
+
+if (result.QuotaSnapshots.TryGetValue("premium_interactions", out var premium))
+{
+    Console.WriteLine(
+        $"Premium interactions: {premium.UsedRequests}/{premium.EntitlementRequests} " +
+        $"({premium.RemainingPercentage:F1}% left, resets {premium.ResetDate?.ToString("o") ?? "n/a"})");
+}
+```
+
+
+ +
+Java
+
+
+```java
+var result = client.getRpc().account.getQuota().join();
+var premium = result.quotaSnapshots().get("premium_interactions");
+
+if (premium != null) {
+    // Fall back to "n/a" when no reset date is reported, matching the other language examples.
+    Object resets = premium.resetDate() != null ? premium.resetDate() : "n/a";
+    System.out.printf("Premium interactions: %d/%d (%.1f%% left, resets %s)%n",
+        premium.usedRequests(), premium.entitlementRequests(),
+        premium.remainingPercentage(), resets);
+}
+```
+
+
+ +
+Rust
+
+```rust
+let result = client.rpc().account().get_quota().await?;
+
+if let Some(premium) = result.quota_snapshots.get("premium_interactions") {
+    let resets = premium.reset_date.as_deref().unwrap_or("n/a");
+    println!(
+        "Premium interactions: {}/{} ({:.1}% left, resets {resets})",
+        premium.used_requests, premium.entitlement_requests, premium.remaining_percentage,
+    );
+}
+```
+
+
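To gate work on quota rather than only display it, the snapshot fields can be reduced to a small status value. The following is a minimal sketch, not an SDK API: `QuotaSnapshot`, `QuotaStatus`, and `quotaStatus` are hypothetical names, the interface simply restates the fields from the table in this section, and the 10% warning threshold is an arbitrary default. It also assumes that `usedRequests` reaching `entitlementRequests` means the quota is exhausted, which may not hold for plans that allow overage.

```typescript
// Hypothetical local shape restating the documented snapshot fields.
interface QuotaSnapshot {
    entitlementRequests: number; // -1 means unlimited
    usedRequests: number;
    remainingPercentage: number;
    resetDate?: string;
}

type QuotaStatus = "unknown" | "unlimited" | "exhausted" | "low" | "ok";

// Classify a snapshot so callers can decide whether to warn or block.
// `snapshot` is undefined when the key is missing from quotaSnapshots.
function quotaStatus(
    snapshot: QuotaSnapshot | undefined,
    lowThresholdPercent = 10, // arbitrary default chosen for this sketch
): QuotaStatus {
    if (!snapshot) return "unknown";
    if (snapshot.entitlementRequests === -1) return "unlimited";
    // ASSUMPTION: no overage, so used >= entitlement means nothing is left.
    if (snapshot.usedRequests >= snapshot.entitlementRequests) return "exhausted";
    if (snapshot.remainingPercentage <= lowThresholdPercent) return "low";
    return "ok";
}
```

A caller might pass `quotaSnapshots["premium_interactions"]` straight in and treat `"unknown"` the same as `"ok"`, so that a missing key never blocks work.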
+
+## Choosing the right API
+
+Use this summary to decide which API fits your use case:
+
+* **Render a live cost or token meter as a turn runs**: subscribe to `assistant.usage` and `session.usage_info`.
+* **Show a final cost summary after a turn or session**: call `session.usage.getMetrics`.
+* **Display context-window usage on resume, before any new turn**: call `session.metadata.contextInfo`.
+* **Estimate cost before running work**: read `models.list` token prices.
+* **Warn users before they exhaust their plan**: call `account.getQuota`.
+
+## Further reading
+
+* [Streaming events](./streaming-events.md): full field-level reference for `assistant.usage`, `session.usage_info`, and every other session event
+* [Observability](../observability/README.md): export usage data to OpenTelemetry for cost attribution
+* [Multi-tenancy](../setup/multi-tenancy.md): resolve per-user quota and models with a GitHub token