The Practical Guide to Scaling Django
By Andrew on 11/10/2024
Most Django scaling guides focus on theoretical maximums. But real scaling isn’t about handling hypothetical millions of users - it’s about systematically eliminating bottlenecks as you grow. Here’s how to do it right, based on patterns that work in production.
Django powers some of the largest web applications, including Instagram and Pinterest, but it is easy to get bogged down by common pitfalls.
First, Know Your Actual Bottlenecks
Before diving into solutions, understand that Django performance usually hits these bottlenecks in order:
- Database queries
- Template rendering
- Python processing
- Cache misses
- File I/O
- Network latency
Don’t optimize what isn’t slowing you down. Measure first, then tackle each bottleneck once it becomes a real problem. The sections below cover each in turn.
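One low-effort way to find out which requests are actually slow is a small timing middleware. The following is a minimal sketch, not a replacement for a real APM tool: the class name, the logger name, and the 0.5 second threshold are assumptions chosen for illustration. It relies only on the standard shape of a Django middleware (a callable built from `get_response`).

```python
import logging
import time

logger = logging.getLogger("request_timing")

# Hypothetical default threshold; tune it to your own latency targets.
SLOW_REQUEST_SECONDS = 0.5


class RequestTimingMiddleware:
    """Log a warning for any request slower than the threshold."""

    def __init__(self, get_response, threshold=SLOW_REQUEST_SECONDS):
        # Django passes the next handler in the chain as get_response.
        self.get_response = get_response
        self.threshold = threshold

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        elapsed = time.perf_counter() - start
        if elapsed >= self.threshold:
            path = getattr(request, "path", "<unknown>")
            logger.warning("slow request: %s took %.3fs", path, elapsed)
        return response
```

To try it, add the class's dotted path (for example a hypothetical `myproject.middleware.RequestTimingMiddleware`) to `MIDDLEWARE` in settings.py, then look at which paths show up in the log before deciding what to optimize.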
Database Optimization
1. Query Optimization
# Bad: N+1 queries
for user in User.objects.all():
    print(user.profile.bio)  # One extra query per user

# Good: Single JOINed query with select_related
users = User.objects.select_related('profile').all()
for user in users:
    print(user.profile.bio)  # No additional queries
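To see why this matters, it can help to count queries outside the ORM. The sketch below uses plain `sqlite3` as a stand-in for what the ORM sends to the database. The table names, the sample rows, and the `run` helper are all made up for illustration; it is not what Django generates verbatim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE app_profile (user_id INTEGER REFERENCES app_user(id), bio TEXT);
    INSERT INTO app_user VALUES (1, 'a@example.com');
    INSERT INTO app_user VALUES (2, 'b@example.com');
    INSERT INTO app_user VALUES (3, 'c@example.com');
    INSERT INTO app_profile VALUES (1, 'bio a');
    INSERT INTO app_profile VALUES (2, 'bio b');
    INSERT INTO app_profile VALUES (3, 'bio c');
    """
)

query_log = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    """Execute one statement and record it so queries can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def bios_n_plus_one():
    # One query for the users, then one more per user for the profile.
    bios = []
    for (user_id,) in run("SELECT id FROM app_user ORDER BY id"):
        rows = run("SELECT bio FROM app_profile WHERE user_id = ?", (user_id,))
        bios.append(rows[0][0])
    return bios


def bios_joined():
    # A single JOIN, which is roughly what select_related asks the database for.
    rows = run(
        "SELECT p.bio FROM app_user u "
        "JOIN app_profile p ON p.user_id = u.id ORDER BY u.id"
    )
    return [bio for (bio,) in rows]
```

With three users, the first function issues four statements and the second issues one; the gap grows linearly with the number of rows. Note that `select_related` only follows forward foreign keys and one-to-one relations. For many-to-many and reverse relations, `prefetch_related` runs one extra query per relation and joins the results in Python.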
2. Database Indexing
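from django.contrib.auth.models import User  # assumes the default auth user model
from django.db import models

class Order(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)  # on_delete is required since Django 2.0
    created_at = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=20)

    class Meta:
        # Composite indexes: column order matters, so match it to your most common filters
        indexes = [
            models.Index(fields=['created_at', 'status']),
            models.Index(fields=['user', 'status']),
        ]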
3. Queryset Optimization
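# Bad: Loading entire objects when you only need a couple of fields
users = User.objects.all()

# Good: Only loading needed fields (returns dicts, not model instances)
users = User.objects.values('id', 'email')

# For large querysets: iterator() streams rows instead of caching the whole result set in memory
for user in User.objects.iterator():
    process_user(user)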
Caching
1. View-Level Caching
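from django.shortcuts import render
from django.views.decorators.cache import cache_page

from .models import Product  # assumes Product is defined in this app's models.py

@cache_page(60 * 15)  # Cache the whole response for 15 minutes
def product_list(request):
    products = Product.objects.all()
    return render(request, 'products/list.html', {'products': products})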
2. Template Fragment Caching
{% load cache %}
{% cache 500 sidebar request.user.id %}
    {% for item in expensive_query %}
        {{ item }}
    {% endfor %}
{% endcache %}
3. Low-Level Cache API
from django.core.cache import cache

def get_expensive_result(user_id):
    cache_key = f'expensive_result_{user_id}'
    result = cache.get(cache_key)
    if result is None:  # Cache miss
        result = expensive_computation(user_id)
        cache.set(cache_key, result, timeout=3600)  # Keep for one hour
    return result
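One caveat with the pattern above: if `expensive_computation` can legitimately return `None`, the `is None` check treats every lookup as a miss and the work is redone each time. A sentinel default avoids that. The sketch below wraps the same get/set logic in a decorator. The names `cached`, `_MISSING`, and `DictCache` are hypothetical, and `DictCache` is only an in-memory stand-in so the example runs without Django; in a project you would pass `django.core.cache.cache`, whose `get(key, default)` and `set(key, value, timeout)` have the same shape.

```python
import functools

_MISSING = object()  # sentinel, so a cached None still counts as a hit


def cached(cache, key_prefix, timeout=3600):
    """Cache a function's result per positional-argument tuple."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = f"{key_prefix}:{':'.join(map(str, args))}"
            result = cache.get(key, _MISSING)
            if result is _MISSING:
                result = func(*args)
                cache.set(key, result, timeout)
            return result

        return wrapper

    return decorator


class DictCache:
    """In-memory stand-in for Django's cache object; it ignores timeouts."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value
```

Usage would look like `@cached(cache, 'expensive_result')` above `expensive_computation`. Django's cache also offers `cache.get_or_set(key, default, timeout)`, which covers the simple case in one call.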
Async: When You Need Concurrent Connections
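# views.py
import aiohttp
from django.http import JsonResponse

async def async_view(request):
    # Run under ASGI (e.g. uvicorn or daphne) to get the concurrency benefit
    async with aiohttp.ClientSession() as session:
        async with session.get('http://api.example.com/data') as response:
            data = await response.json()
    return JsonResponse(data, safe=False)  # safe=False permits a non-dict payload such as a list

# urls.py
from django.urls import path
from .views import async_view

urlpatterns = [
    path('async-data/', async_view),
]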
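The payoff from async views comes when one request has to wait on several slow I/O calls at once. The sketch below shows the idea with `asyncio.gather`. The `fetch` coroutine and its 0.2 second sleep are stand-ins for real upstream HTTP calls, and the endpoint names are invented for illustration.

```python
import asyncio
import time


async def fetch(name, delay):
    """Stand-in for an upstream HTTP call; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return {name: "ok"}


async def fetch_all():
    # The three calls wait concurrently, so total time tracks the slowest call
    # rather than the sum of all three.
    results = await asyncio.gather(
        fetch("profile", 0.2),
        fetch("orders", 0.2),
        fetch("recommendations", 0.2),
    )
    merged = {}
    for part in results:
        merged.update(part)
    return merged


if __name__ == "__main__":
    start = time.perf_counter()
    print(asyncio.run(fetch_all()))
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

Inside an async Django view you would `await fetch_all()` directly rather than calling `asyncio.run`. If a view makes only one upstream call and is otherwise CPU-bound or ORM-bound, async is unlikely to help much.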
Background Tasks: Don’t Block the Request-Response Cycle
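from celery import shared_task
from django.contrib.auth import get_user_model
from django.core.mail import send_mail
from django.shortcuts import redirect

User = get_user_model()

@shared_task
def send_welcome_email(user_id):
    # Pass the ID rather than the object so the task payload stays small and serializable
    user = User.objects.get(id=user_id)
    send_mail(
        'Welcome!',
        'Thanks for joining.',
        'from@example.com',
        [user.email],
    )

# In your view
def signup(request):
    user = User.objects.create_user(...)
    send_welcome_email.delay(user.id)  # Returns immediately; a Celery worker sends the email
    return redirect('home')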
Load Balancing: When a Single Server Isn’t Enough
# settings.py for multiple servers
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'HOST': 'primary.database.host',
        'CONN_MAX_AGE': 60,  # Reuse each connection for up to 60 seconds
    },
    'read_replica': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'HOST': 'replica.database.host',
        'CONN_MAX_AGE': 60,
    },
}
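Defining a second database alias does nothing on its own; Django keeps sending every query to `default` until a router says otherwise. The following is a minimal router sketch for the two aliases above. The class name and the module path mentioned afterwards are assumptions; the four method names are the ones Django's router interface looks for.

```python
class PrimaryReplicaRouter:
    """Send reads to the replica and writes to the primary."""

    def db_for_read(self, model, **hints):
        return "read_replica"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same data, so relations between them are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Migrate the primary only; the replica gets schema changes via replication.
        return db == "default"
```

Register it with `DATABASE_ROUTERS = ['myproject.routers.PrimaryReplicaRouter']` (a hypothetical path). Keep replication lag in mind: a row written to the primary may not be visible on the replica immediately, so reads that must see a fresh write can be pinned with `.using('default')`.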
Media Files: Move to CDN Early
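# settings.py (requires django-storages and boto3)
import os

# Django 4.2+ uses STORAGES; DEFAULT_FILE_STORAGE and STATICFILES_STORAGE were removed in 5.1
STORAGES = {
    'default': {'BACKEND': 'storages.backends.s3boto3.S3Boto3Storage'},
    'staticfiles': {'BACKEND': 'storages.backends.s3boto3.S3StaticStorage'},
}
# Read credentials from the environment instead of hard-coding them
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
AWS_STORAGE_BUCKET_NAME = 'your-bucket-name'
# Swap in your CDN hostname here once a CDN sits in front of the bucket
AWS_S3_CUSTOM_DOMAIN = f'{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com'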
Real-World Scaling Checkpoints
At 100 Requests/Second
- Implement basic caching
- Add database indexes
- Move static files to CDN
At 1,000 Requests/Second
- Add read replicas
- Implement fragment caching
- Move to managed Redis/Memcached
At 10,000 Requests/Second
- Shard databases
- Implement service-level caching
- Consider microservices for heavy operations
The Scaling Checklist
Before adding complexity, verify you’ve done these:
□ Optimized database queries (select_related, prefetch_related)
□ Added proper database indexes
□ Implemented view and template caching
□ Moved static/media files to CDN
□ Set up monitoring and alerting
□ Configured connection pooling
□ Implemented background tasks for heavy operations
□ Added read replicas for heavy read loads
□ Set up proper logging and error tracking
Remember: Django can handle more load than most people think when properly optimized. Start simple, measure everything, and scale what actually needs scaling.
The best scaling strategy isn’t adding more resources - it’s eliminating waste in your existing ones.