The Practical Guide to Scaling Django

By Andrew on 11/10/2024

Most Django scaling guides focus on theoretical maximums. But real scaling isn’t about handling hypothetical millions of users - it’s about systematically eliminating bottlenecks as you grow. Here’s how to do it right, based on patterns that work in production.

Django powers some of the largest web applications, including Instagram and Pinterest. But it’s easy to get bogged down by common pitfalls.

First, Know Your Actual Bottlenecks

Before diving into solutions, understand that Django performance usually hits these bottlenecks in order:

Database queries
Template rendering
Python processing
Cache misses
File I/O
Network latency
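
To see which of these is actually costing you time, it helps to measure each request first. Below is a minimal sketch of a timing middleware. The logger name, the header name, and the 0.5 second threshold are assumptions chosen for illustration, not Django defaults.

```python
import logging
import time

logger = logging.getLogger("request_timing")  # hypothetical logger name


class TimingMiddleware:
    """Record how long each request takes and log the slow ones."""

    SLOW_REQUEST_SECONDS = 0.5  # assumed threshold; tune for your app

    def __init__(self, get_response):
        # Django passes in the next handler in the middleware chain.
        self.get_response = get_response

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        elapsed = time.perf_counter() - start

        # Expose the timing so it shows up in browser dev tools and proxy logs.
        response["X-Response-Time-ms"] = f"{elapsed * 1000:.1f}"
        if elapsed > self.SLOW_REQUEST_SECONDS:
            logger.warning(
                "Slow request: %s took %.3fs",
                getattr(request, "path", "?"),
                elapsed,
            )
        return response
```

Adding the class to MIDDLEWARE in settings.py should be enough to start collecting numbers. This only tells you which requests are slow; a per-query breakdown needs a tool such as django-debug-toolbar.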

Don’t optimize what isn’t slowing you down. Here’s how to tackle each one when it becomes a real problem:

Database Optimization

1. Query Optimization

# Bad: N+1 queries
for user in User.objects.all():
    print(user.profile.bio)  # One query per user

# Good: Single query with select_related
users = User.objects.select_related('profile').all()
for user in users:
    print(user.profile.bio)  # No additional queries

2. Database Indexing

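from django.contrib.auth.models import User
from django.db import models

class Order(models.Model):
    # on_delete is a required argument for ForeignKey
    user = models.ForeignKey(User, on_delete=models.CASCADE)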
    created_at = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=20)

    class Meta:
        indexes = [
            models.Index(fields=['created_at', 'status']),
            models.Index(fields=['user', 'status']),
        ]

3. Queryset Optimization

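# Bad: loading every column of every row into model instances
users = User.objects.all()

# Good: fetch only the needed columns (values() returns dicts, not model instances)
users = User.objects.values('id', 'email')

# For large querysets: iterator() streams rows instead of caching the whole result set
for user in User.objects.iterator():
    process_user(user)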

Caching

1. View-Level Caching

from django.shortcuts import render
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # Cache for 15 minutes
def product_list(request):
    products = Product.objects.all()
    return render(request, 'products/list.html', {'products': products})

2. Template Fragment Caching

{% load cache %}

{% cache 500 sidebar request.user.id %}
    {% for item in expensive_query %}
        {{ item }}
    {% endfor %}
{% endcache %}

3. Low-Level Cache API

from django.core.cache import cache

def get_expensive_result(user_id):
    cache_key = f'expensive_result_{user_id}'
    result = cache.get(cache_key)
    
    if result is None:
        result = expensive_computation(user_id)
        cache.set(cache_key, result, timeout=3600)
    
    return result
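
The get-then-set pattern above tends to repeat across a codebase, so one option is to wrap it in a decorator. The sketch below is an illustration, not a Django API: `cached` and `DictCache` are hypothetical names, and `DictCache` is only an in-memory stand-in so the example runs without a configured cache. In a real project you would pass `django.core.cache.cache` instead. It uses a sentinel as the default for `cache.get`, so a legitimately cached `None` is not recomputed on every call.

```python
import functools

_MISSING = object()  # sentinel so a cached None is not mistaken for a miss


def cached(cache, key_prefix, timeout=3600):
    """Wrap a function so its results are stored in `cache` under a per-argument key."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            cache_key = f"{key_prefix}_" + "_".join(str(a) for a in args)
            result = cache.get(cache_key, _MISSING)
            if result is _MISSING:
                result = func(*args)
                cache.set(cache_key, result, timeout=timeout)
            return result
        return wrapper
    return decorator


class DictCache:
    """In-memory stand-in with the same get/set shape as Django's cache (ignores timeout)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value
```

With this in place, `get_expensive_result` could be written as a plain function decorated with `@cached(cache, 'expensive_result')`. The key format here assumes arguments with stable string forms, such as integer IDs.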

Async: When You Need Concurrent Connections

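# views.py
import aiohttp
from django.http import JsonResponse

async def async_view(request):
    # Await the upstream call without blocking the worker (run under ASGI to benefit)
    async with aiohttp.ClientSession() as session:
        async with session.get('http://api.example.com/data') as response:
            data = await response.json()
    # JsonResponse only accepts a dict by default; pass safe=False for other JSON types
    return JsonResponse(data)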

# urls.py
path('async-data/', async_view)
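
Async pays off most when a view has to wait on several slow calls at once. The sketch below shows the idea with `asyncio.gather`. `fetch_one` and `fetch_all` are hypothetical helpers, and `asyncio.sleep` stands in for a real HTTP request so the example runs without a network.

```python
import asyncio


async def fetch_one(name, delay=0.05):
    # Stand-in for an awaitable HTTP call such as session.get(...) in the view above.
    await asyncio.sleep(delay)
    return {"source": name}


async def fetch_all(names):
    # gather() runs the coroutines concurrently, so the total time is close to
    # the slowest single call rather than the sum of all of them.
    return await asyncio.gather(*(fetch_one(name) for name in names))
```

Inside an async view this would be `results = await fetch_all(['users', 'orders'])`. Results come back in the same order as the inputs.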

Background Tasks: Don’t Block the Request-Response Cycle

from celery import shared_task
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.shortcuts import redirect

@shared_task
def send_welcome_email(user_id):
    user = User.objects.get(id=user_id)
    send_mail(
        'Welcome!',
        'Thanks for joining.',
        'from@example.com',
        [user.email],
    )

# In your view
def signup(request):
    user = User.objects.create_user(...)
    send_welcome_email.delay(user.id)
    return redirect('home')

Load Balancing: When a Single Server Isn’t Enough

# settings.py for multiple servers
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'HOST': 'primary.database.host',
        'CONN_MAX_AGE': 60,
    },
    'read_replica': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'HOST': 'replica.database.host',
        'CONN_MAX_AGE': 60,
    }
}
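
Defining the replica in DATABASES does not send any traffic to it on its own; Django also needs a database router. The following is a minimal sketch of one, using the router method names Django looks for. The class name and the `replicas` list are assumptions, with the aliases taken from the settings above.

```python
import random


class PrimaryReplicaRouter:
    """Send reads to a replica and writes to the primary."""

    replicas = ["read_replica"]  # aliases from the DATABASES setting above

    def db_for_read(self, model, **hints):
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases point at the same data, so relations are always fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against the primary; replicas receive them via replication.
        return db == "default"
```

To enable it, list the class in settings.py, for example `DATABASE_ROUTERS = ['myproject.routers.PrimaryReplicaRouter']` (the module path is hypothetical). Replicas can lag behind the primary, so a read that must see a just-written row may need `.using('default')` on the queryset.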

Media Files: Move to a CDN Early

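# settings.py
import os

# Django 4.2+ uses STORAGES; DEFAULT_FILE_STORAGE and STATICFILES_STORAGE were removed in 5.1
STORAGES = {
    'default': {'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage'},
    'staticfiles': {'BACKEND': 'storages.backends.s3boto3.S3StaticStorage'},
}

# Read credentials from the environment rather than committing them to source control
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']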
AWS_STORAGE_BUCKET_NAME = 'your-bucket-name'
AWS_S3_CUSTOM_DOMAIN = f'{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com'

Real-World Scaling Checkpoints

At 100 Requests/Second

Implement basic caching
Add database indexes
Move static files to CDN

At 1,000 Requests/Second

Add read replicas
Implement fragment caching
Move to managed Redis/Memcached

At 10,000 Requests/Second

Shard databases
Implement service-level caching
Consider microservices for heavy operations
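
Sharding means every server has to agree on which database holds a given row. One simple way to get there is to hash a stable key, as in the sketch below. The alias names and `shard_for_user` are hypothetical, and modulo hashing is only one of several possible schemes.

```python
import zlib

SHARD_ALIASES = ["shard_0", "shard_1", "shard_2", "shard_3"]  # hypothetical DATABASES aliases


def shard_for_user(user_id, aliases=SHARD_ALIASES):
    """Map a user id to one database alias, identically on every server."""
    # crc32 is stable across processes, unlike the built-in hash() for strings.
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return aliases[digest % len(aliases)]
```

A query would then pick its database with something like `Order.objects.using(shard_for_user(user.id))`. Note that with modulo hashing, changing the number of shards remaps most keys, which is one reason to put this step off until the earlier checkpoints are exhausted.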

The Scaling Checklist

Before adding complexity, verify you’ve done these:

□ Optimized database queries (select_related, prefetch_related)

□ Added proper database indexes

□ Implemented view and template caching

□ Moved static/media files to CDN

□ Set up monitoring and alerting

□ Configured connection pooling

□ Implemented background tasks for heavy operations

□ Added read replicas for heavy read loads

□ Set up proper logging and error tracking

Remember: Django can handle more load than most people think when properly optimized. Start simple, measure everything, and scale what actually needs scaling.

The best scaling strategy isn’t adding more resources - it’s eliminating waste in your existing ones.
