
API Design Philosophy: Why We Chose gRPC + OpenAPI

Jordan Kim
Tags: api design, grpc, openapi


When building Weyl’s API, we faced a crucial decision: REST, gRPC, GraphQL, or something else? Here’s why we chose a hybrid approach.

The Requirements

Our API needed to satisfy multiple, sometimes conflicting constraints:

  1. Performance requirements
  2. Developer experience
  3. Operational requirements

The Hybrid Solution

We expose three API surfaces:

1. REST/JSON (OpenAPI 3.1)

For exploration and simple integrations:

# Works in any terminal
curl -X POST https://api.weyl.ai/v1/generate \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a sunset", "model": "sdxl"}'

Benefits:

Trade-offs:

2. gRPC

For production workloads:

service Weyl {
  rpc Generate(GenerateRequest) returns (GenerateResponse);
  rpc GenerateStream(GenerateRequest) returns (stream Frame);
}
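
As a sketch of what a production caller looks like, here is a minimal TypeScript client using @grpc/grpc-js and @grpc/proto-loader. The weyl.proto file name, the request fields, and the authorization metadata key are assumptions for illustration, not the published contract.

// Minimal TypeScript sketch of a streaming gRPC call (assumed details).
import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';

const pkgDef = protoLoader.loadSync('weyl.proto'); // assumed file name
const proto = grpc.loadPackageDefinition(pkgDef) as any;

const client = new proto.Weyl('api.weyl.ai:443', grpc.credentials.createSsl());

const metadata = new grpc.Metadata();
metadata.set('authorization', `Bearer ${process.env.WEYL_API_KEY}`);

// Server-streaming call: frames arrive as soon as the server produces them.
const call = client.generateStream({ prompt: 'a sunset', model: 'sdxl' }, metadata);
call.on('data', (frame: unknown) => console.log('frame', frame));
call.on('end', () => console.log('stream complete'));
call.on('error', (err: Error) => console.error(err));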

Benefits:

Trade-offs:

3. WebSocket

For interactive applications:

const ws = new WebSocket('wss://api.weyl.ai/v1/ws');
ws.onopen = () => ws.send(JSON.stringify({ type: 'generate', /* ... */ }));
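
Receiving results uses the same socket. A hedged sketch follows; the message shape ({ type, data }) is an assumption for illustration, not the documented wire format.

// Assumed message shape, for illustration only.
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === 'frame') {
    console.log('received frame', msg.data); // render or buffer the frame
  } else if (msg.type === 'done') {
    ws.close();
  }
};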

Benefits:

Trade-offs:

Design Principles

Principle 1: Consistent Errors

All errors follow RFC 7807 (Problem Details):

{
  "type": "https://api.weyl.ai/errors/rate-limit",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "Tier allows 100 req/min, you sent 150",
  "instance": "/v1/generate",
  "retry_after": 30
}

This works identically across REST, gRPC (via status details), and WebSocket.
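
To show what uniform errors buy a client, here is a hedged TypeScript sketch of a single handler over the REST surface. The ProblemDetails shape follows RFC 7807 plus the retry_after extension shown above; the retry policy is illustrative, not Weyl's recommendation.

// One error path for every failure: parse Problem Details, react to it.
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  retry_after?: number; // extension member used by the rate-limit error
}

async function callGenerate(body: unknown): Promise<unknown> {
  const res = await fetch('https://api.weyl.ai/v1/generate', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.WEYL_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (res.ok) return res.json();

  const problem = (await res.json()) as ProblemDetails;
  if (problem.status === 429 && problem.retry_after !== undefined) {
    // Honor the server's suggested back-off, then retry once.
    const waitMs = problem.retry_after * 1000;
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    return callGenerate(body);
  }
  throw new Error(`${problem.title}: ${problem.detail ?? problem.type}`);
}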

Principle 2: Idempotency Keys

All mutation endpoints accept an Idempotency-Key header:

curl -X POST https://api.weyl.ai/v1/generate \
  -H "Idempotency-Key: unique-id-12345" \
  ...

Why it matters:

Implementation: We cache responses for 24 hours keyed by (user_id, idempotency_key).
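
On the client side, the pattern is to generate one key per logical request and reuse it across retries, so a timeout or dropped connection cannot trigger a duplicate generation. A hedged sketch; the endpoint and field names mirror the curl examples above, and the retry loop itself is illustrative.

// Reuse the same Idempotency-Key for every attempt of one logical request.
async function generateOnce(body: unknown, attempts = 3): Promise<Response> {
  const idempotencyKey = crypto.randomUUID(); // Node 19+/browser global
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fetch('https://api.weyl.ai/v1/generate', {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${process.env.WEYL_API_KEY}`,
          'Content-Type': 'application/json',
          'Idempotency-Key': idempotencyKey, // identical key on retries
        },
        body: JSON.stringify(body),
      });
    } catch (err) {
      lastError = err; // network failure: safe to retry with the same key
    }
  }
  throw lastError;
}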

Principle 3: Versioning via Accept Header

# Request v1
curl -H "Accept: application/vnd.weyl.v1+json" ...
# Request v2 (future)
curl -H "Accept: application/vnd.weyl.v2+json" ...

Why not /v1/ in path?
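
For illustration, here is a minimal content-negotiation sketch (Express, assumed, not Weyl's server code): the resource path stays stable while the Accept header selects the representation. The handler bodies are hypothetical placeholders.

// Dispatch on the Accept header instead of the URL path.
import express, { Request, Response } from 'express';

const handleGenerateV1 = (_req: Request, res: Response) => res.json({ version: 1 });
const handleGenerateV2 = (_req: Request, res: Response) => res.json({ version: 2 });

const app = express();

app.post('/generate', express.json(), (req, res) => {
  const accept = req.get('Accept') ?? '';
  if (accept.includes('application/vnd.weyl.v2+json')) return handleGenerateV2(req, res);
  return handleGenerateV1(req, res); // default to the v1 representation
});

app.listen(8080);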

Principle 4: Pagination as Streams

Instead of page and limit, we use cursor-based pagination:

{
  "data": [...],
  "cursor": "eyJpZCI6MTIzLCJ0cyI6MTYzODM2MDAwMH0"
}

Next page:

curl "https://api.weyl.ai/v1/jobs?cursor=eyJ..."

Benefits:

Client Libraries

We auto-generate clients from our specs:

REST (OpenAPI)

# TypeScript
bun add @weyl/client
# Python
pip install weyl
# Go
go get github.com/weyl-ai/weyl-go

All generated from openapi.yaml using openapi-generator.
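
A hypothetical usage sketch for the TypeScript package; the import and method names are guesses at what the generated client exposes, not its actual surface.

// Hypothetical API surface; the real generated @weyl/client may differ.
import { WeylClient } from '@weyl/client';

const client = new WeylClient({ apiKey: process.env.WEYL_API_KEY });
const job = await client.generate({ prompt: 'a sunset', model: 'sdxl' });
console.log(job.id);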

gRPC (Protobuf)

# Install buf (https://buf.build), then generate clients
buf generate

This generates clients in 10+ languages from our .proto files.

OpenAPI Specification Highlights

Our openapi.yaml is ~2000 lines and includes:

View it: api.weyl.ai/openapi.yaml

Performance Results

Comparing REST vs gRPC for the same workload (1000 generate calls):

Metric      | REST/JSON | gRPC   | Improvement
Total time  | 127s      | 94s    | 1.35x faster
Avg latency | 127ms     | 94ms   | 26% reduction
P99 latency | 310ms     | 198ms  | 36% reduction
Bandwidth   | 2.1 MB    | 0.8 MB | 2.6x smaller

gRPC wins on performance, but REST is good enough for most use cases.

Lessons Learned

What Worked Well

  1. Hybrid approach: Use the right tool for the job
  2. OpenAPI-first: Generate docs, tests, and mocks from spec
  3. Conservative versioning: We haven’t needed v2 yet

What We’d Do Differently

  1. Avoid nested resources: /users/{user_id}/jobs/{job_id} is verbose and rigid
  2. More webhooks: Would reduce polling load
  3. JSON Schema: More validation = fewer runtime errors

Try It

Explore our API interactively at api.weyl.ai.