Building a High-Performance Serverless URL Shortener

Introduction

URL shorteners are the “Hello World” of system design, but building one that is production-ready, scalable, and extremely fast is a different challenge. I recently built a serverless URL redirection service on Azure that averages sub-50ms response times.

The Stack

Compute: Azure Functions (Python 3.11)
Database: Azure Cosmos DB (NoSQL)
Monitoring: Application Insights

sequenceDiagram
    participant User
    participant Function as Azure Function
    participant Cosmos as Cosmos DB
    participant Dest as Destination URL
    
    User->>Function: GET /short-slug
    Function->>Cosmos: Point Read (slug as ID & partition key)
    Cosmos-->>Function: Document with original URL
    Function-->>User: 302 Redirect
    User->>Dest: Follow redirect
    
    Note over Function,Cosmos: Sub-50ms average response time

Optimizing for Speed

1. Point Reads vs. Queries

The single biggest performance win came from how I query the database. Instead of running a SQL-like query (SELECT * FROM c WHERE c.slug = 'xyz'), I structured the data to use Point Reads.

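By using the slug as both the document ID and the partition key, I can fetch the document directly as a key-value lookup. This bypasses the query engine entirely: a point read of a 1 KB item costs 1 request unit (RU), which is cheaper and faster than running the equivalent query.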

# The slow way (Query)
# items = container.query_items(query="SELECT * FROM c WHERE c.slug = @slug", ...)

# The fast way (Point Read)
item = container.read_item(item=slug, partition_key=slug)
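
A point read raises an error when the slug does not exist, so the lookup needs a not-found path. The following is a minimal sketch rather than the service's actual handler: `lookup_url` and `redirect_for` are hypothetical names, the `url` field is an assumed document field, and `container` is assumed to be an azure-cosmos container client (anything with a compatible `read_item` method works).

```python
from typing import Any, Dict, Optional, Tuple


def lookup_url(container: Any, slug: str) -> Optional[str]:
    """Return the original URL for `slug`, or None if no document exists."""
    try:
        # Point read: the slug is both the item id and the partition key value.
        item = container.read_item(item=slug, partition_key=slug)
    except Exception as exc:
        # azure-cosmos reports a missing item as an HTTP 404 error
        # (CosmosResourceNotFoundError). Checking status_code keeps this
        # sketch importable without the SDK installed.
        if getattr(exc, "status_code", None) == 404:
            return None
        raise
    return item["url"]  # "url" is an assumed field name


def redirect_for(container: Any, slug: str) -> Tuple[int, Dict[str, str]]:
    """Map a slug to an (HTTP status, headers) pair."""
    url = lookup_url(container, slug)
    if url is None:
        return 404, {}
    return 302, {"Location": url}
```

In a real function app the status and headers would be passed into the HTTP response object; any error other than a 404 is re-raised so that genuine database failures are not mistaken for unknown slugs.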

2. Connection Reuse

In serverless environments, establishing database connections is expensive. I implemented a global singleton pattern for the Cosmos DB client, so it is created once per worker process instead of once per request. Subsequent invocations on the same warm instance reuse the existing TCP connection, shaving hundreds of milliseconds off the latency.
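
One way to sketch the singleton is a module-level cache guarded by a lock. This is an illustration under stated assumptions, not the exact code from the service: the environment variable names (`COSMOS_ENDPOINT`, `COSMOS_KEY`, `COSMOS_DATABASE`, `COSMOS_CONTAINER`) and the helper names are hypothetical, and the `build` parameter exists only so the cache can be exercised without a live account.

```python
import os
import threading

_container = None          # cached for the lifetime of the worker process
_lock = threading.Lock()   # guards first-time initialisation


def _build_container():
    """Create the Cosmos DB container client (assumed env var names)."""
    # Imported lazily so the module loads even where azure-cosmos is absent.
    from azure.cosmos import CosmosClient

    client = CosmosClient(
        os.environ["COSMOS_ENDPOINT"],
        credential=os.environ["COSMOS_KEY"],
    )
    database = client.get_database_client(os.environ["COSMOS_DATABASE"])
    return database.get_container_client(os.environ["COSMOS_CONTAINER"])


def get_container(build=_build_container):
    """Return the shared container client, creating it on first use."""
    global _container
    if _container is None:
        with _lock:
            # Re-check inside the lock so concurrent callers build it once.
            if _container is None:
                _container = build()
    return _container
```

The first call on a fresh instance still pays the connection cost; only later invocations that land on the same warm instance benefit from the cached client.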

3. Lean Payload

I optimized the document structure to store only what’s necessary: the original URL, creation date, and expiry. This keeps the payload small and network transfer times negligible.
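
As a rough illustration of such a document, the sketch below builds one with only those three pieces of data plus the ID. The field names (`url`, `created_at`, `expires_at`), the helper names, and the assumption that the container's partition key path is `/id` are mine, not taken from the service.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def build_document(slug: str, url: str, ttl_days: int,
                   now: Optional[datetime] = None) -> Dict[str, str]:
    """Build the minimal document stored for one short link."""
    if now is None:
        now = datetime.now(timezone.utc)
    return {
        "id": slug,  # also the partition key value (assumes path "/id")
        "url": url,
        "created_at": now.isoformat(),
        "expires_at": (now + timedelta(days=ttl_days)).isoformat(),
    }


def is_expired(doc: Dict[str, str], now: Optional[datetime] = None) -> bool:
    """Return True once the document's expiry timestamp has been reached."""
    if now is None:
        now = datetime.now(timezone.utc)
    return datetime.fromisoformat(doc["expires_at"]) <= now
```

Storing timestamps as ISO 8601 strings with an explicit UTC offset keeps them readable in the portal and unambiguous when parsed back.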

Resilience and Monitoring

I didn’t just build for speed; I built for reliability.

Structured Logging: Every request logs detailed metrics (response time, DB latency) as JSON, making it easy to query in Application Insights.
Health Checks: A dedicated /health endpoint checks connectivity to Cosmos DB, allowing a load balancer to route traffic away if the service degrades.
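
Both ideas can be sketched in a few lines. The logger name, the JSON field names, and the helpers `timed`, `log_request`, and `check_health` are hypothetical. The health probe assumes `container` is an azure-cosmos container client and uses its `read()` call, which fetches the container's properties, as a cheap connectivity check; the real service may probe differently.

```python
import json
import logging
import time
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("shortener")  # assumed logger name


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def log_request(slug: str, status: int, total_ms: float, db_ms: float) -> str:
    """Emit one JSON log line per request and return it."""
    line = json.dumps({
        "event": "redirect",
        "slug": slug,
        "status": status,
        "response_ms": round(total_ms, 2),
        "db_ms": round(db_ms, 2),
    })
    logger.info(line)
    return line


def check_health(container: Any) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, body) describing Cosmos DB connectivity."""
    try:
        _, db_ms = timed(container.read)  # metadata read as a probe
    except Exception as exc:
        return 503, {"status": "unhealthy", "error": type(exc).__name__}
    return 200, {"status": "healthy", "db_ms": round(db_ms, 2)}
```

Because each log line is a single JSON object, fields such as `db_ms` can be parsed out and aggregated in Application Insights queries instead of being scraped from free text.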

Conclusion

Serverless doesn’t have to mean “cold starts and slow responses.” By understanding the underlying mechanics of the database and the runtime, I built a service that is both cost-effective and incredibly performant.