Python Developer Interview Questions

200 scenario-based questions with detailed model answers, organized by skill and by tool.

Q001 | Core Python & Data Model | Mid

Your team's e-commerce service uses a custom `Product` class. A junior dev added `__eq__` to compare by SKU, but now products disappear from sets after mutation. The bug is reproducible. Walk through the root cause and the fix.
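
A minimal sketch of how the symptom can arise. The scenario does not show the class, so this assumes `__hash__` is derived from the same mutable `sku` field that `__eq__` compares; `MutableProduct` and `FrozenProduct` are hypothetical names. The frozen dataclass is one possible fix, not necessarily the one the team would choose.

```python
from dataclasses import dataclass


class MutableProduct:
    """Hypothetical reconstruction: equality and hash both depend on a mutable field."""

    def __init__(self, sku: str) -> None:
        self.sku = sku

    def __eq__(self, other: object) -> bool:
        return isinstance(other, MutableProduct) and self.sku == other.sku

    def __hash__(self) -> int:
        return hash(self.sku)


@dataclass(frozen=True)
class FrozenProduct:
    """One possible fix: the identity field cannot change, so the hash is stable."""

    sku: str


product = MutableProduct("SKU-1")
catalog = {product}
product.sku = "SKU-2"  # the set still files the object under the old hash
lost_after_mutation = product not in catalog

frozen = FrozenProduct("SKU-1")
frozen_catalog = {frozen}
still_found = frozen in frozen_catalog
```

The object is still physically inside `catalog`; it simply cannot be found, because lookups now use the hash of the new SKU.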

Q002 | Core Python & Data Model | Senior

A fintech team's `Decimal`-heavy pricing engine is leaking memory under sustained load. A profiler shows thousands of live `PriceContext` instances. The class uses `__slots__`, yet memory grows. Where do you look first and why?

Q003 | Async & Concurrency (asyncio/GIL) | Mid

A logistics startup's FastAPI service fetches ETAs from three third-party carrier APIs before responding. Under load, p99 latency is 4 seconds because the calls run sequentially. You have 2 hours to fix it without changing the carrier SDKs.
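
One low-risk direction, sketched under the assumption that the carrier SDKs are blocking (synchronous) libraries: run the three calls in worker threads and await them together. `fetch_eta` is a hypothetical stand-in for an SDK call, and the 2-second timeout is an illustrative value.

```python
import asyncio
import time


def fetch_eta(carrier: str) -> dict:
    """Stand-in for a blocking carrier SDK call (hypothetical)."""
    time.sleep(0.2)  # simulates network latency
    return {"carrier": carrier, "eta_hours": 24}


async def fetch_all_etas(carriers: list, timeout: float = 2.0) -> list:
    # Each blocking call runs in its own worker thread, so the calls overlap
    # instead of running back to back.
    calls = [asyncio.to_thread(fetch_eta, name) for name in carriers]
    return await asyncio.wait_for(asyncio.gather(*calls), timeout=timeout)


start = time.perf_counter()
etas = asyncio.run(fetch_all_etas(["carrier_a", "carrier_b", "carrier_c"]))
elapsed = time.perf_counter() - start
```

With this shape, total latency tracks the slowest carrier rather than the sum of all three. `asyncio.gather` returns results in the order the calls were passed in.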

Q004 | Async & Concurrency (asyncio/GIL) | Senior

Your Python data pipeline uses multiprocessing to bypass the GIL for CPU-bound transforms. In production, workers occasionally deadlock after a fork. The deadlock traces always show a `threading.Lock` inside the logging module. Explain the mechanism and your remediation.

Q005 | Django | Mid

A healthcare SaaS app's admin sees a Django debug toolbar query panel showing 847 queries for a single patient detail page. The page loads patient records with related appointments, prescriptions, and lab results. Fix this without restructuring the data model.

Q006 | Django | Senior

A multi-tenant SaaS built on Django serves 400 tenants on a shared PostgreSQL schema. Tenant data isolation is enforced by filtering on `tenant_id` in every view. A security audit finds two endpoints missing the filter. Propose a systemic fix that prevents this class of bug permanently.

Q007 | FastAPI/Flask | Mid

Your Flask API returns stale user profile data intermittently. Investigation shows that some requests hit a replica with replication lag. The DB router directs reads to replicas and writes to primary. Users see their own just-submitted changes missing for 5–10 seconds.

Q008 | FastAPI/Flask | Senior

A FastAPI service handles webhook payloads from a payment processor. Under sustained load at 2,000 req/s, you observe periodic 503s. Profiling shows the event loop is blocked for 200ms+ during payload validation. The Pydantic models are complex with nested validators.

Q009 | ORM & SQLAlchemy | Mid

A reporting service using SQLAlchemy Core runs a nightly aggregation query that times out after 30 minutes on a table with 200M rows. The query does a full table scan with a GROUP BY. The DBA says indexes won't help here. What SQLAlchemy-level changes do you make?

Q010 | ORM & SQLAlchemy | Senior

Your team migrated from SQLAlchemy 1.4 to 2.0. In production, connection pool exhaustion occurs every few days, requiring a service restart. The pool is sized at 20 with overflow 10. The 1.4 codebase never had this issue with identical config.

Q011 | REST API Design | Mid

You're designing a bulk-import endpoint for an HR system. Clients POST CSV files with up to 50,000 employee records. The current synchronous implementation times out for files over 1,000 rows. Design the API contract for the async pattern.

Q012 | REST API Design | Senior

Your team is evolving a public REST API used by 300 enterprise clients. A breaking change in the data model is required to support a new compliance regulation. Clients have 6 months to migrate but cannot tolerate downtime. Walk through your versioning and deprecation strategy.

Q013 | Packaging & Dependencies | Mid

A CI pipeline that builds a Python service Docker image started failing after a colleague's PR that didn't change any Python files. The `pip install -r requirements.txt` step errors with a dependency conflict. Your `requirements.txt` has no pinned versions for transitive dependencies.

Q014 | Packaging & Dependencies | Senior

Your company operates an internal PyPI server. A security scan reveals a package installed in production that was never in any `requirements.txt` — it appears to have been injected through a dependency confusion attack. Walk through how this happened and your hardening steps.

Q015 | Performance & Profiling | Mid

A Python script that processes 10,000 JSON files nightly has grown from 8 minutes to 47 minutes over six months as data volume tripled. You need to identify and fix the bottleneck in one business day.

Q016 | Performance & Profiling | Senior

A Python ML inference service achieves 120 req/s on a 32-core machine but CPU utilization is only 18%. The model runs in PyTorch. Your task is to identify why throughput is capped and propose a path to 400+ req/s.

Q017 | Testing (pytest) | Mid

Your pytest suite has 1,200 tests and takes 14 minutes to run locally. A developer joining the team says they skip running tests because of the wait time. Walk through how you cut runtime to under 3 minutes without removing tests.

Q018 | Testing (pytest) | Senior

A new engineer on your team writes tests that pass individually but fail when run in a specific order as part of the full suite. You suspect test pollution but the engineer insists their test is correct. How do you diagnose and enforce isolation going forward?

Q019 | Celery & Task Queues | Mid

A Celery worker processing email notifications is consuming 100% CPU on a single core. Task throughput is 12/second but emails only send at ~3/second. Investigation shows tasks are frequently retrying. What are your diagnostic and fix steps?

Q020 | Celery & Task Queues | Senior

Your e-commerce platform uses Celery with Redis broker. During a Black Friday sale, 2 million tasks queue up and workers slow to a crawl. Memory on the Redis instance hits 95% and some tasks are being silently dropped. Post-mortem the failure modes and redesign for resilience.

Q021 | Typing & Pydantic | Mid

You join a team maintaining a Flask API where all request/response handling is done with raw `dict` access. Senior engineers complain about runtime KeyError bugs and hard-to-review PRs. You propose adding Pydantic. Walk through the migration strategy for an existing endpoint.

Q022 | Typing & Pydantic | Senior

Your FastAPI service has a `Union[TypeA, TypeB]` Pydantic model for an endpoint that accepts two distinct payload shapes. In production, all TypeA payloads are being silently coerced to TypeB, causing data corruption. Explain the root cause and fix.

Q023 | Caching | Mid

A product catalog API serves 50,000 RPM. After adding Redis caching with a 5-minute TTL, cache hit rate is only 40% despite the catalog changing less than once per hour. Investigation shows the cache is mostly empty. Where is the bug and how do you fix it?

Q024 | Caching | Senior

Your Redis cluster hit a cache stampede during a product launch: a high-traffic cached key expired and 4,000 concurrent requests simultaneously queried the database, causing it to go down. Redesign the caching layer to prevent this class of failure.
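
One common building block for stampede protection is "single-flight" loading: on a miss, only one caller recomputes the value while the others wait for it. The sketch below shows the idea inside a single process with threads. The in-memory dict stands in for Redis and `load_from_db` is a hypothetical loader; across multiple processes the lock would have to live in a shared store instead.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_cache: dict = {}          # stands in for the Redis cache
_key_locks: dict = {}      # one lock per cache key
_registry_lock = threading.Lock()
db_calls = 0


def load_from_db(key: str) -> str:
    """Stand-in for the expensive database query (hypothetical)."""
    global db_calls
    db_calls += 1  # only ever runs while holding the per-key lock
    time.sleep(0.05)
    return f"value-for-{key}"


def get(key: str) -> str:
    value = _cache.get(key)
    if value is not None:
        return value
    with _registry_lock:
        lock = _key_locks.setdefault(key, threading.Lock())
    with lock:
        # Re-check: another thread may have filled the cache while we waited.
        value = _cache.get(key)
        if value is None:
            value = load_from_db(key)
            _cache[key] = value
        return value


with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(get, ["product:42"] * 200))
```

Here 200 concurrent reads of a cold key result in a single database call. This addresses the thundering herd on one key only; serving stale values while refreshing and refreshing before expiry are separate design choices.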

Q025 | Security | Mid

A penetration test on your Django REST API finds that it's vulnerable to mass assignment: sending extra fields in a POST request updates fields that should be read-only (e.g., `is_admin`, `account_balance`). Fix this systematically.

Q026 | Security | Senior

Your Python API service processes user-uploaded Excel files and uses `openpyxl` to extract data. A security review flags the risk of XXE and formula injection. You also discover that file processing runs in the same process as the web server. Describe your remediation.

Q027 | Production Debugging | Mid

Your Django app is throwing `IntegrityError: null value in column 'user_id' violates not-null constraint` in production. The stack trace points to a Celery task, not a view. The same code has been running for 3 months without issue. What has changed and how do you find it?

Q028 | Production Debugging | Senior

A Python microservice in Kubernetes occasionally restarts with OOMKill. The pod limit is 512MB. Heap profiling with `tracemalloc` shows only 180MB of Python objects. RSS is 480MB. Explain the 300MB gap and your investigation strategy.

Q029 | Core Python & Data Model | Mid

A teammate's utility function caches expensive API results using a mutable default argument: `def fetch(key, cache={})`. It worked fine in development but causes stale data bugs in production where the service runs for days. Explain the issue and the idiomatic fix.

Q030 | Django | Senior

Your Django project's test suite takes 45 minutes to run because of database setup. The suite has 3,000 tests and uses `TestCase` throughout. A colleague suggests switching entirely to `SimpleTestCase`. Why is this wrong, and what is the correct migration path?

Q031 | FastAPI/Flask | Mid

You need to add request-level logging with a unique trace ID to an existing FastAPI application. The trace ID must appear in all log lines emitted during a request, including those from helper modules and third-party libraries that use the standard `logging` module. How do you implement this without modifying every log call?
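
A sketch of one standard-library approach: keep the trace ID in a `contextvars.ContextVar` and attach a `logging.Filter` to the handler, so every record that reaches the handler is stamped regardless of which module emitted it. The logger name `third_party.lib` and the ID `req-123` are illustrative; in FastAPI the `set`/`reset` pair would presumably live in a middleware.

```python
import contextvars
import io
import logging

# Holds the trace ID for the request currently being handled.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copies the current trace ID onto every record passing through the handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True


stream = io.StringIO()  # stands in for stderr or a log shipper
handler = logging.StreamHandler(stream)
handler.addFilter(TraceIdFilter())
handler.setFormatter(logging.Formatter("%(trace_id)s %(name)s %(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# Start of a request: set the ID. End of the request: restore the previous value.
token = trace_id_var.set("req-123")
logging.getLogger("third_party.lib").info("fetching rates")
trace_id_var.reset(token)

output = stream.getvalue()
```

The filter is attached to the handler rather than to a logger because logger-level filters are not applied to records that propagate up from child loggers.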

Q032 | ORM & SQLAlchemy | Mid

A data sync job using SQLAlchemy bulk-inserts 500,000 rows per night. After 3 months, inserts have slowed from 8 seconds to 4 minutes. The table has grown to 80M rows. No new columns or indexes have been added. What do you investigate and fix?

Q033 | REST API Design | Mid

A mobile team reports that your REST API's `GET /orders` endpoint returns 8MB payloads for users with large order histories. This causes timeout errors on 3G connections. The data model cannot be changed. How do you solve this without a major API rewrite?

Q034 | Packaging & Dependencies | Senior

Your Python monorepo contains 12 services sharing internal libraries. A change to a shared library broke 3 downstream services that weren't in the PR's scope. Walk through your strategy for managing internal dependencies and preventing cross-service breakage.

Q035 | Performance & Profiling | Senior

A data science team's Python batch job processes 1TB of log data daily using pandas DataFrames. Memory usage peaks at 200GB, requiring an expensive EC2 instance. Your goal is to cut memory usage by 10x without changing the business logic.

Q036 | Testing (pytest) | Senior

Your team's API has 95% test coverage but continues to ship regressions in edge cases. A post-mortem on three recent incidents shows that all failures involved specific combinations of valid inputs that no test covered. How do you improve the test strategy?

Q037 | Celery & Task Queues | Senior

Your platform sends transactional emails via Celery tasks. A deployment incident caused workers to be offline for 90 minutes. On recovery, 240,000 queued tasks ran simultaneously, overwhelming the email provider (rate limit: 100/s) and causing widespread delivery failures. How do you prevent this on the next incident?

Q038 | Typing & Pydantic | Mid

Your team is adding type hints to a legacy 50,000-line Django codebase. Running `mypy` produces 2,400 errors. The codebase is actively developed and you can't freeze it for a 3-month typing sprint. How do you manage the migration?

Q039 | Caching | Senior

A B2B SaaS platform caches user permission sets in Redis with a 15-minute TTL. A security vulnerability is reported: when a user's permissions are revoked, they can still access restricted resources for up to 15 minutes. The product team won't accept more than a 30-second revocation latency. How do you redesign the caching layer?

Q040 | Security | Senior

Your Python service generates signed JWT tokens for service-to-service authentication. A red team exercise discovers that the service accepts tokens with `alg: none`, allowing unsigned tokens. The JWT library is PyJWT. Walk through the root cause and hardening.

Q041 | Production Debugging | Senior

A Python service's p99 latency spikes to 8 seconds every 4 hours, exactly. Outside the spikes, p99 is 120ms. The spikes last 90 seconds. No deployments happen during spikes. What is the most likely cause and how do you confirm and fix it?

Q042 | Core Python & Data Model | Senior

A platform engineering team is building a Python DSL for CI pipeline configuration. They need `Pipeline() >> StageA() >> StageB()` syntax to chain stages. Implement the `__rshift__` protocol correctly, and explain the pitfalls when stages are shared between pipeline definitions.
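
A minimal sketch of the operator protocol. It assumes stages are simple named values; `Stage` with a `name` field is a hypothetical stand-in for `StageA` and `StageB`. The key design choice is that `>>` returns a new `Pipeline` instead of mutating the left operand, which is one way to avoid the shared-stage pitfall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str


@dataclass(frozen=True)
class Pipeline:
    stages: tuple = ()

    def __rshift__(self, other: object) -> "Pipeline":
        if not isinstance(other, Stage):
            # Returning NotImplemented lets Python raise TypeError for bad operands.
            return NotImplemented
        # Build a new Pipeline rather than appending in place, so a shared
        # base pipeline is never modified by a later definition.
        return Pipeline(self.stages + (other,))


base = Pipeline() >> Stage("build")
release = base >> Stage("test") >> Stage("deploy")
lint_only = base >> Stage("lint")
```

If `__rshift__` appended to a list on `self` and returned `self`, then `release` and `lint_only` would be the same object carrying all four stages. With the immutable version, `base` still holds only `build` after both derived pipelines are defined.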

Q043 | Async & Concurrency (asyncio/GIL) | Senior

A Python streaming API using asyncio receives 10,000 concurrent WebSocket connections. Memory grows linearly at 2MB per connection — 20GB total. Each connection runs an `asyncio.Task`. You need to serve 50,000 connections on the same hardware. Where is the memory going?

Q044 | Django | Mid

After a Django upgrade from 3.2 to 4.2, a production feature that sent emails on user registration stopped working silently. No errors appear in Sentry. The `send_mail` call is inside a `post_save` signal handler. What changed and how do you debug it?

Q045 | ORM & SQLAlchemy | Senior

You're designing a multi-version document storage system in PostgreSQL using SQLAlchemy. Documents are versioned on every edit; the latest version must be queryable with a single indexed lookup. Propose the schema and ORM design, noting trade-offs.

Q046 | Async & Concurrency (asyncio/GIL) | Mid

Your asyncio service runs a mix of CPU-bound (image thumbnail generation) and I/O-bound (S3 upload) operations per request. You've tried `asyncio.gather` but CPU work still blocks the event loop. Explain why and fix the architecture.

Q047 | Testing (pytest) | Mid

Your team writes API integration tests against a third-party payment gateway. Tests fail intermittently because the external API has rate limits and occasional downtime. Test runs block CI for 20 minutes waiting for the external service. How do you fix this?

Q048 | Security | Mid

During a code review, you notice a Django view that constructs a SQL query using f-string interpolation of a `request.GET` parameter. The endpoint is publicly accessible. Walk through the exploitation risk and how to fix it without changing the feature.
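
The difference between interpolating input into SQL text and binding it as a parameter can be shown with the standard-library `sqlite3` driver. This is a stand-in for the Django view, which the scenario does not show; the `orders` table and the `customer` parameter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO orders (customer) VALUES (?)", [("alice",), ("bob",)])

user_input = "nobody' OR '1'='1"  # what an attacker might send as ?customer=

# Vulnerable: the input becomes part of the SQL text and changes the WHERE clause.
unsafe_sql = f"SELECT customer FROM orders WHERE customer = '{user_input}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the value is bound as a parameter and is only ever compared as data.
safe = conn.execute(
    "SELECT customer FROM orders WHERE customer = ?", (user_input,)
).fetchall()
```

The interpolated query returns every row, while the parameterized one returns none. In Django the same principle applies whether the fix is an ORM filter or a raw query with placeholders and a separate parameter list.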

Q049 | Production Debugging | Mid

A Python Flask API returns correct data for 99% of requests but one endpoint intermittently returns a previous user's data. The bug is reported by 3 users. The endpoint reads from a local variable populated in a `before_request` hook. What is the most likely cause?

Q050 | Core Python & Data Model | Senior

A configuration management library uses `__init_subclass__` and a class-level registry dict to auto-register all configuration classes. After a large refactor, some config classes are no longer registered. No errors occur. What are the failure modes and how do you make the registry robust?

Q051 | Packaging & Dependencies | Mid

Your Python service runs in a Docker container. The image builds successfully but the container exits immediately in production with `ModuleNotFoundError: No module named 'uvicorn'`. The same image works in staging. Both environments pull from the same registry. What do you investigate?

Q052 | Performance & Profiling | Mid

A Python service that generates PDF reports takes 40 seconds per report. The product team wants it under 5 seconds. Profiling shows 35 seconds in font loading and image embedding, which your PDF library does on every call. The same fonts and images are used in 95% of reports.

Q053 | Celery & Task Queues | Mid

Your team uses Celery to run data processing tasks. A task that processes user-uploaded files occasionally fails with `OSError: [Errno 28] No space left on device` on the worker. The disk usage on the worker is high but the actual data being processed is small. Where is the disk space going?

Q054 | Typing & Pydantic | Senior

You are building a plugin system where third-party plugins implement a `PluginProtocol` with `load()`, `run(data: dict)`, and `unload()` methods. You need runtime validation that an arbitrary object satisfies the protocol, and typed signatures for plugin authors. Design the typing architecture.

Q055 | Production Debugging | Senior

A Python microservice's health check returns 200 but it stops processing requests after 6–8 hours of operation. Restarting the service restores normal operation. No exceptions appear in logs. Memory and CPU look normal. What systematic approach do you take?

Q056 | Core Python & Data Model | Senior

Your e-commerce platform's cart service returns different totals when the same items are added in different orders due to floating-point rounding. Customers are complaining about £0.01 discrepancies at checkout. How do you diagnose and permanently fix this without rewriting the entire pricing engine?
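
The order dependence is easy to reproduce: binary floating-point addition is not associative, so grouping the same three prices differently can change the last bit of the sum. The prices and the 1.175 multiplier below are illustrative values, not taken from the scenario.

```python
from decimal import Decimal, ROUND_HALF_UP

prices = [0.1, 0.2, 0.3]
forward = (prices[0] + prices[1]) + prices[2]
backward = prices[0] + (prices[1] + prices[2])
order_dependent = forward != backward  # float addition is not associative

# Same amounts as Decimal, built from strings so no binary rounding creeps in.
exact = [Decimal("0.10"), Decimal("0.20"), Decimal("0.30")]
total_forward = sum(exact, Decimal("0"))
total_backward = sum(reversed(exact), Decimal("0"))

# Round once, at the boundary, with an explicit rounding rule.
line_total = (Decimal("19.99") * Decimal("1.175")).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)
```

A pragmatic migration path is to convert at the edges of the pricing engine first (parsing, summation, final rounding) and leave internal logic untouched until the discrepancies stop.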

Q057 | Core Python & Data Model | Mid

A colleague's PR uses a mutable default argument in a Flask view helper: `def apply_filters(data, filters=[])`. In code review, you flag it as a bug. The author says it works fine in their local tests. Walk through exactly why this is wrong and what the correct pattern is.
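
A small reproduction helps in review. The function bodies here are hypothetical (the PR is not shown); only the signature `apply_filters(data, filters=[])` comes from the scenario. The default list is created once, when the `def` statement runs, and is then shared by every call that omits `filters`.

```python
def apply_filters_buggy(data, filters=[]):
    filters.append("exclude_deleted")  # mutates the one shared default list
    return {"rows": data, "applied": filters}


def apply_filters(data, filters=None):
    # A fresh list per call; copying also avoids mutating the caller's list.
    filters = [] if filters is None else list(filters)
    filters.append("exclude_deleted")
    return {"rows": data, "applied": filters}


first = apply_filters_buggy([])
second = apply_filters_buggy([])
shared_state = first["applied"] is second["applied"]

fixed_first = apply_filters([])
fixed_second = apply_filters([])
```

A local test that calls the helper once per process never sees the second call, which is why it appears to work. In a long-running worker the list keeps growing across requests.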

Q058 | Async & Concurrency (asyncio/GIL) | Senior

Your FastAPI data-ingestion service processes 500 webhooks per second. Under load, you observe that CPU usage on a single core spikes to 100% while other cores sit idle, and p99 latency climbs to 8 seconds. Describe your diagnosis process and the architectural changes you'd make.

Q059 | Async & Concurrency (asyncio/GIL) | Mid

You're writing an asyncio script that fetches 1,000 URLs concurrently using aiohttp. After deploying, you see 'Too many open files' errors and the remote servers start rate-limiting your IP. What's wrong and how do you fix it?
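
The usual remedy is to cap how many requests are in flight at once. The sketch below uses `asyncio.Semaphore`; `fake_fetch` is a stand-in for the aiohttp request so the example runs without network access, and the limit of 20 is an assumed value to tune against the file-descriptor budget and the remote rate limits.

```python
import asyncio

MAX_IN_FLIGHT = 20  # assumed limit, not a recommendation
in_flight = 0
peak_in_flight = 0


async def fake_fetch(url: str) -> str:
    """Stand-in for an aiohttp GET; it only tracks how many calls overlap."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    await asyncio.sleep(0.001)
    in_flight -= 1
    return f"body of {url}"


async def bounded_fetch(semaphore: asyncio.Semaphore, url: str) -> str:
    async with semaphore:  # at most MAX_IN_FLIGHT coroutines pass this point at once
        return await fake_fetch(url)


async def main(urls: list) -> list:
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    return await asyncio.gather(*(bounded_fetch(semaphore, u) for u in urls))


bodies = asyncio.run(main([f"https://example.com/{i}" for i in range(1000)]))
```

All 1,000 tasks are still created up front, but only 20 hold a connection at any moment. With aiohttp, sharing one session and setting a connector limit achieves a similar cap at the connection level.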

Q060 | Django | Senior

A Django monolith serving 2M daily users has a 'recently viewed products' feature that runs a raw SQL query joining five tables on every page load. The query takes 400ms average and is called 50M times daily. The team wants to add caching but the product owner insists the data must always be current. How do you architect a solution?

Q061 | Django | Mid

While reviewing a Django pull request, you find that a view calls `User.objects.filter(email=request.POST.get('email'))` and then checks `if user.is_staff` to gate access to an admin action. What security vulnerabilities do you identify and how do you fix them?

Q062 | FastAPI/Flask | Senior

You're building a FastAPI service for a fintech startup that will handle payment webhooks from Stripe. The CTO wants sub-50ms response times and reliable processing even if your database goes down for 30 seconds. Design the webhook handler architecture.

Q063 | FastAPI/Flask | Mid

Your Flask API is randomly returning 500 errors under moderate load. Looking at the logs you see 'QueuePool limit overflow' from SQLAlchemy. You've set `SQLALCHEMY_POOL_SIZE=5`. Explain what's happening and how you configure the pool correctly for a gunicorn deployment.

Q064 | ORM & SQLAlchemy | Senior

A data analytics dashboard in a Django application loads slowly because its main queryset triggers 847 SQL queries for a page showing 50 records. Your profiling tool shows this. You have one sprint to fix it without rewriting the underlying data model. What's your approach?

Q065 | ORM & SQLAlchemy | Mid

You're using SQLAlchemy Core with PostgreSQL. A colleague writes a function that opens a connection, runs a query, and returns results — but never explicitly closes the connection. Code review time: what are the risks and what's the correct pattern?

Q066 | REST API Design | Senior

You're designing an API for a logistics platform where clients need to track shipment status. Some clients poll every second, others need real-time push updates, and some only check hourly. You have FastAPI and Redis available. Design an API that handles all three access patterns efficiently.

Q067 | REST API Design | Mid

A mobile app team is complaining that your REST API returns entire user objects (80+ fields) when they only need name, avatar_url, and last_seen for a contacts list screen. They're on slow 3G networks. How do you solve this without maintaining two separate endpoints?

Q068 | Packaging & Dependencies | Senior

Your Python microservice Docker image takes 12 minutes to build and is 1.4 GB in size. The CI pipeline runs 40 times a day. Engineers are complaining about slow feedback loops. Diagnose what's likely wrong and walk through the optimizations to get to under 2 minutes and under 200 MB.

Q069 | Packaging & Dependencies | Mid

A new developer on your team runs `pip install -r requirements.txt` and gets a different version of a transitive dependency than production, causing a bug that took 4 hours to diagnose. What process and tooling changes do you implement to prevent this?

Q070 | Performance & Profiling | Senior

A Python data pipeline processes insurance claim records: it reads 500,000 CSV rows, applies business rules, joins with a reference table in memory, and writes results. The current runtime is 47 minutes. You have one week to get it under 5 minutes. Walk through your profiling and optimization strategy.

Q071 | Performance & Profiling | Mid

A Django API endpoint that generates a PDF report is taking 8 seconds and timing out for some users. Looking at New Relic traces, you see the entire 8 seconds is inside one function called `generate_report()`. You have no profiler installed. How do you find the slow line?

Q072 | Testing (pytest) | Mid

You're tasked with writing tests for a function that calls `datetime.datetime.now()` internally to generate timestamps for audit logs. Every time you run the test, the timestamp is different, making assertions impossible. How do you test this properly?
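
One option that needs no patching is to make the clock an explicit, defaulted parameter. `build_audit_entry` is a hypothetical name for the function under test; production callers omit `now`, while tests pass a fixed clock.

```python
from datetime import datetime, timezone
from typing import Callable


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def build_audit_entry(action: str, now: Callable[[], datetime] = utc_now) -> dict:
    """Hypothetical audit helper; the clock is a parameter so tests can pin it."""
    return {"action": action, "timestamp": now().isoformat()}


FIXED = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
entry = build_audit_entry("login", now=lambda: FIXED)
live_entry = build_audit_entry("login")  # production path: real clock
```

When the signature cannot change, patching the name the module looks up or using a time-freezing library are the usual alternatives. The injected clock keeps the test free of global state.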

Q073 | Celery & Task Queues | Senior

Your Celery worker fleet is processing email notifications. During a 2-hour database outage, 450,000 tasks accumulate in the queue. When the database recovers, all workers immediately retry at full speed, re-triggering the outage. How do you design the system to handle this restart-storm scenario?

Q074 | Celery & Task Queues | Mid

A background Celery task that processes user-uploaded images is sometimes running twice on the same image, causing duplicate watermarks and corrupted files. The task uses `@shared_task`. How do you diagnose and fix the duplicate execution problem?

Q075 | Typing & Pydantic | Senior

Your team is migrating a large Flask API from Pydantic v1 to Pydantic v2. The app has 200+ models. You need to minimize runtime breakage and keep the migration to a two-week sprint. What's your migration strategy?

Q076 | Typing & Pydantic | Mid

A new API endpoint accepts a JSON body representing a financial transaction. A colleague uses `amount: float` as the Pydantic field type. You raise a concern in code review. What's wrong and what's the correct Pydantic type definition?

Q077 | Caching | Senior

Your product catalog API serves 50,000 requests per minute. You add Redis caching with a 10-minute TTL. On Monday morning when the cache expires simultaneously for 8,000 product keys, your database crashes from the sudden load spike. Diagnose and architect a solution to prevent this.
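
Synchronized expiry is commonly broken up by adding random jitter to each key's TTL, so keys written together do not expire together. A minimal sketch follows. `jittered_ttl` is a hypothetical helper and the 20% spread is an assumed value.

```python
import random
from typing import Optional


def jittered_ttl(base_seconds: int, spread: float = 0.2,
                 rng: Optional[random.Random] = None) -> int:
    """Return base_seconds shifted randomly by up to +/- spread (a fraction)."""
    rng = rng or random.Random()
    factor = 1 + rng.uniform(-spread, spread)
    return max(1, round(base_seconds * factor))


rng = random.Random(0)  # seeded only to make the example repeatable
ttls = [jittered_ttl(600, rng=rng) for _ in range(8000)]
```

With a 600-second base, the 8,000 keys now expire across a window of roughly four minutes instead of in the same second. Jitter spreads the load but does not protect a single hot key, which needs request coalescing or early refresh.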

Q078 | Caching | Mid

You're implementing Django's `cache.get_or_set()` for user profile data. A team member points out that the pattern is not atomic and could serve stale data to some users during the computation window. What's the exact race condition and how do you handle it?

Q079 | Security | Senior

You perform a security audit on a FastAPI service that accepts file uploads for a document processing platform. Users can upload PDFs and ZIP files. Describe the full threat model and the defense-in-depth measures you'd implement before the service goes to production.

Q080 | Security | Mid

A penetration tester reports that your Django REST API returns detailed SQLAlchemy tracebacks including table names, column names, and partial query strings to API clients when a database error occurs. How do you fix this across the entire application?

Q081 | Production Debugging | Senior

Your Python API service has been running fine for three weeks. Starting Monday, memory usage climbs steadily from 200MB to 2GB over 6 hours, then the process is OOM-killed. The cycle repeats. You have no dedicated memory profiler installed. How do you diagnose the leak?

Q082 | Production Debugging | Mid

A Django production deployment starts returning HTTP 500 errors for all requests. The previous deploy was 20 minutes ago. You have SSH access to the server and access to the application logs. Walk through your systematic diagnosis and rollback decision process.

Q083 | Core Python & Data Model | Senior

You're building a plugin system for a data pipeline tool where users can register custom transformation functions at startup. A senior engineer suggests using `__init_subclass__` for auto-registration. Explain how you'd implement this, what edge cases exist, and when you'd use a decorator-based registry instead.
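
A sketch of the `__init_subclass__` approach, with hypothetical class names (`Transform`, `Uppercase`, `Deduplicate`) and an optional `name` keyword introduced for illustration. It raises on duplicate names, since silent overwrites are one of the edge cases worth guarding against.

```python
from typing import Optional


class Transform:
    """Base class; subclasses register themselves when their class body runs."""

    registry: dict = {}

    def __init_subclass__(cls, *, name: Optional[str] = None, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        key = name or cls.__name__.lower()
        if key in Transform.registry:
            # Fail loudly instead of silently replacing an earlier plugin.
            raise ValueError(f"duplicate transform name: {key!r}")
        Transform.registry[key] = cls

    def apply(self, rows: list) -> list:
        raise NotImplementedError


class Uppercase(Transform, name="upper"):
    def apply(self, rows: list) -> list:
        return [str(row).upper() for row in rows]


class Deduplicate(Transform):
    def apply(self, rows: list) -> list:
        return list(dict.fromkeys(rows))  # keeps first occurrence, preserves order


result = Transform.registry["upper"]().apply(["a", "b"])
```

Registration only happens when the module defining the subclass is actually imported, so a plugin file that nothing imports is silently absent from the registry. A decorator-based registry has the same import requirement but also works for plain functions and does not force plugins into an inheritance hierarchy.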

Q084 | Async & Concurrency (asyncio/GIL) | Senior

You're debugging a FastAPI service where `asyncio.sleep(0)` is scattered throughout a codebase by a previous developer. A junior engineer asks you to remove them all as dead code. Explain whether this is safe, what their purpose is, and what the correct approach is.

Q085 | Django | Senior

A healthcare platform using Django needs to implement row-level security so that doctors can only read patients assigned to their clinic, even if a bug in application code constructs an incorrect queryset. The compliance team says 'application-level filtering is not sufficient.' How do you implement database-level row security with Django?

Q086 | Django | Mid

You need to add a non-nullable column `tenant_id` to an existing Django model that has 500,000 rows in production. Running the migration naively would lock the table for minutes. Describe a zero-downtime migration strategy.

Q087 | ORM & SQLAlchemy | Senior

A legacy SQLAlchemy application has no explicit transaction management — each ORM call auto-commits. The team wants to introduce a multi-step workflow (deduct inventory → create order → charge payment) where all three steps must be atomic. The existing codebase has 400 files. How do you introduce proper transaction management without touching all 400 files?

Q088 | REST API Design | Senior

Your public REST API for a SaaS product needs to support API versioning. You have 50 enterprise clients on v1, and v2 has breaking changes to 3 endpoints. Some clients can't migrate for 6-12 months due to procurement cycles. Design the versioning strategy and sunset policy.

Q089 | Packaging & Dependencies | Mid

You're releasing an internal Python library used by 8 microservices. After publishing version 2.1.0, three services start failing in production. The error is 'AttributeError: module has no attribute process_event'. Your 2.0.0 had this function. What happened and how do you prevent this going forward?

Q090 | Performance & Profiling | Senior

A trading platform's risk calculation service processes 10,000 portfolios every 30 seconds. Currently written in pure Python, it takes 28 seconds — dangerously close to the 30-second window. You can't change the data model or the algorithmic logic. What are your options?

Q091 | Testing (pytest) | Mid

Your team wants to add property-based testing to a function that parses ISO 8601 date strings. A junior developer says 'we already have 20 example-based tests, property testing is redundant.' How do you explain the value and show a concrete example?
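
A concrete property for a date parser is the round trip: formatting any datetime and parsing it back should return the same value. The sketch below hand-rolls the random generation with the standard library so it runs anywhere. A property-testing library such as Hypothesis would replace the generator and add shrinking of failing cases. `parse_iso` is a placeholder that wraps `datetime.fromisoformat`, not the team's real parser.

```python
import random
from datetime import datetime, timedelta, timezone


def parse_iso(text: str) -> datetime:
    """Function under test (here just the stdlib parser, as a placeholder)."""
    return datetime.fromisoformat(text)


def random_datetime(rng: random.Random) -> datetime:
    # Random UTC offset between -12:00 and +14:00, in whole minutes.
    offset = timezone(timedelta(minutes=rng.randrange(-12 * 60, 14 * 60 + 1)))
    base = datetime(1970, 1, 1, tzinfo=offset)
    return base + timedelta(microseconds=rng.randrange(0, 4_000_000_000_000_000))


rng = random.Random(1234)  # seeded so a failure can be reproduced
checked = 0
for _ in range(2000):
    original = random_datetime(rng)
    # Property: formatting then parsing gives back the same instant and offset.
    parsed = parse_iso(original.isoformat())
    assert parsed == original and parsed.utcoffset() == original.utcoffset()
    checked += 1
```

Twenty hand-picked examples tend to cover the cases the author already thought of. The loop above exercises odd offsets, zero microseconds, and month and year boundaries without anyone having to list them.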

Q092 | Celery & Task Queues | Senior

Your Celery task processes user data export requests, which take 2-15 minutes depending on data volume. Users need real-time progress updates (e.g., '45% complete, 3 of 7 files processed'). The current implementation just returns 'Processing...' until done. Design the progress tracking system.

Q093 | Typing & Pydantic | Mid

A FastAPI endpoint decorated with `response_model=UserResponse` is leaking the user's hashed password to API clients. Your Pydantic model for User has a `hashed_password` field. Explain exactly why this happens and the correct pattern to prevent it.

Q094 | Caching | Senior

You're designing a caching layer for a multi-tenant SaaS API. Each tenant has different data, so you can't use a shared cache key. After launching, you notice tenant A's data occasionally appears in tenant B's response. Diagnose the cache key collision and design a bulletproof multi-tenant caching strategy.

Q095 | Security | Senior

You're reviewing a FastAPI application that accepts a `template_name` parameter from users to select an email template. The code uses Jinja2 to render the template with user-supplied data. Security team flags this as a Server-Side Template Injection risk. Confirm if it's vulnerable and design the defense.

Q096 | Security | Mid

A Django application stores API keys in the database that third-party services use to authenticate. A developer proposes to hash them with SHA-256 before storage. The security lead rejects this approach. Explain why and describe the correct implementation.

Q097 | Production Debugging | Senior

Distributed tracing shows that one specific FastAPI endpoint has 99th-percentile latency of 12 seconds while its median is 40ms. The endpoint fetches a user's activity timeline. The slow cases all involve users with more than 10,000 activity events. What's the likely cause and how do you fix it?

Q098 | Production Debugging | Mid

Your Python service is raising 'RecursionError: maximum recursion depth exceeded' in production on seemingly simple requests. The stack trace shows the same function repeating 999 times. The function works fine in development and unit tests. What are the likely causes and how do you debug it?

Q099 | Core Python & Data Model | Mid

A data science team's script reads 500MB of log files into a Python list, processes them, then exits. Memory usage peaks at 3GB. A teammate says 'it's just how Python works.' Is this accurate and what would you change?

Q100 | FastAPI/Flask | Senior

You're building a FastAPI dependency injection system for a SaaS application where every endpoint needs to: verify JWT auth, load the current user from DB, check their subscription plan, and apply rate limiting specific to their plan tier. These run on every request. How do you structure the dependencies to maximize code reuse and testability?

Q101 | Django | Mid

A Django signals handler connected to `post_save` for the Order model sends a confirmation email. In load testing, you discover it's creating a database deadlock. Two concurrent requests create orders, and both trigger the signal, which queries the Order table in a nested transaction. How do you fix this?

Q102 | ORM & SQLAlchemy | Mid

Your team uses SQLAlchemy and has just added a new polymorphic relationship using `polymorphic_on` and `polymorphic_identity`. In the CI tests, everything passes, but in production, queries for the polymorphic subclasses return empty results even though the data is there. What's the likely cause?

Q103 | REST API Design | Mid

Your team debates whether to use HTTP 200 or HTTP 204 when a DELETE request succeeds. One developer always returns `{'status': 'deleted'}` with a 200. Another returns 204 with no body. A third wants to return the deleted resource for client convenience. Who is correct?

Q104 | Performance & Profiling | Mid

A Python script that generates daily reports by reading from a PostgreSQL database runs in 45 seconds. Your manager wants it under 10 seconds. Without changing the database schema or adding hardware, what profiling and optimization steps do you take?

Q105 | Caching | Mid

You implement `@cache` (Python's `functools.cache`, an unbounded `lru_cache`) on a Django view helper that fetches configuration data. Two weeks later, a client reports their config changes aren't taking effect for hours. What happened and how do you fix this without removing caching?
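
One lightweight way to keep in-process caching while bounding staleness is to add a time bucket to the cache key, so entries stop matching once the bucket rolls over. The names `_load_config` and `get_config` and the 300-second window are assumptions for illustration.

```python
import functools
import time

TTL_SECONDS = 300  # assumed freshness window
load_count = 0


@functools.lru_cache(maxsize=128)
def _load_config(client_id: str, time_bucket: int) -> dict:
    """Hypothetical loader; time_bucket exists only to vary the cache key."""
    global load_count
    load_count += 1
    return {"client": client_id, "loaded_in_bucket": time_bucket}


def get_config(client_id: str, now=None) -> dict:
    now = time.time() if now is None else now
    return _load_config(client_id, int(now // TTL_SECONDS))


get_config("acme", now=1000.0)
get_config("acme", now=1100.0)  # same 300-second bucket: served from cache
get_config("acme", now=1300.0)  # next bucket: reloaded
```

Because the cache lives in each worker process, every worker reloads on its own schedule, and an explicit `cache_clear()` reaches only the process that calls it. When changes must be visible immediately, a shared cache with explicit invalidation on write is the more robust route.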

Q106 | Production Debugging | Senior

Your Python application's Celery workers start failing with 'kombu.exceptions.EncodeError: Object of type UUID is not JSON serializable' — but only in production, not staging. The tasks were working 3 days ago. You didn't change the task code. What do you investigate?

Q107 | Async & Concurrency (asyncio/GIL) | Mid

A junior developer writes a FastAPI endpoint that uses `asyncio.run()` inside an async route handler to call a sync function that has its own event loop management. The endpoint works sometimes but randomly raises 'This event loop is already running.' Explain the problem and the correct pattern.

Q108Typing & PydanticSenior

Your team wants to enable `mypy --strict` on a 50,000-line Django codebase that currently has no type annotations. The first run shows 2,400 errors. The CTO wants type safety but can't stop feature development for 3 months. Design a phased adoption strategy.

Q109Testing (pytest)Senior

Your team's integration tests hit a real third-party payment API (Stripe) in the test environment. Tests are flaky because Stripe's sandbox sometimes returns 500 errors and the tests take 45 seconds. How do you restructure this without losing confidence that the integration actually works?

Q110Celery & Task QueuesMid

You have a Celery beat scheduler configured to run a task every 5 minutes. After a deployment, you notice the task runs twice simultaneously every 5 minutes — one from the old beat process and one from the new one. Both write to the database, causing duplicate records. How do you fix this?

Q111Core Python & Data ModelSenior

Your fintech team's transaction ledger runs a nightly reconciliation that computes running balances across 2 million rows using a plain Python list of dicts. It's taking 40 minutes. A junior dev suggests switching to a dict of dicts indexed by account. How do you diagnose whether the bottleneck is memory layout, lookup cost, or something else entirely?

Q112Core Python & Data ModelMid

A colleague opens a PR where a class uses __slots__ to reduce memory on a model that will be instantiated millions of times per hour in an event-processing service. The reviewer comments that __slots__ will break pickling used by Celery. Who is right, and how do you resolve the conflict without abandoning either goal?

Q113Async & Concurrency (asyncio/GIL)Senior

Your real-time bidding service written in asyncio handles 80,000 auction events per second on a single host. Under load, you observe that some coroutines stall for 200–400 ms even though no coroutine awaits an external I/O call. CPU is at 60%. What are the three most likely causes and how do you isolate each?

Q114Async & Concurrency (asyncio/GIL)Mid

You're writing an asyncio-based data ingestion pipeline that must fetch data from 500 external REST endpoints concurrently. Using asyncio.gather(*all_500_tasks) at once causes connection resets and rate-limit 429s. How do you redesign the concurrency pattern to stay within API limits while maximising throughput?
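
A common starting point for this scenario is to cap the number of in-flight requests with a semaphore. The sketch below assumes a caller-supplied `fetch` coroutine; `fetch_all` and `max_concurrency` are hypothetical names, and the limit of 20 is an arbitrary placeholder rather than a recommendation.

```python
import asyncio


async def fetch_all(urls, fetch, max_concurrency=20):
    """Run fetch(url) for every URL with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:  # waits here once the limit is reached
            return await fetch(url)

    # return_exceptions=True returns failures as values instead of
    # raising on the first one, so the caller can retry selectively
    return await asyncio.gather(
        *(bounded(u) for u in urls), return_exceptions=True
    )
```

A semaphore bounds concurrency only. Respecting 429 responses would still need per-host rate limiting or retry with backoff on top of it.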

Q115DjangoSenior

A Django e-commerce platform with 4 million SKUs uses a Category model with a recursive parent foreign key. Fetching the breadcrumb path for a product page requires up to 8 self-joins and is causing 600 ms database latency. The DBA refuses to add recursive CTEs. How do you solve this within the Django ORM?

Q116DjangoMid

A Django REST API used by a mobile app returns a 200 OK with an error message in the response body instead of an appropriate 4xx status code for validation failures. This breaks the mobile client's error-handling logic. How do you audit and fix this pattern systematically across a large codebase?

Q117FastAPI/FlaskSenior

Your FastAPI service processes healthcare document uploads. Under a stress test simulating 200 concurrent multipart file uploads each around 50 MB, the service OOMs at ~80 concurrent requests. Memory profiling shows multiple full file copies in RAM per request. How do you redesign the upload handler to stay within a 2 GB memory budget?

Q118FastAPI/FlaskMid

A Flask API used internally by a data science team has no request validation layer. Downstream Pandas code crashes with cryptic KeyErrors and TypeErrors when callers send unexpected payloads. You're tasked with adding validation without breaking the existing response contracts. What is your approach?

Q119ORM & SQLAlchemySenior

A logistics platform's shipment-tracking service uses SQLAlchemy Core to insert 10,000 location pings per second into Postgres. At peak load, INSERT throughput drops to 3,000/s and the DB shows heavy lock contention on the shipment_locations index. How do you redesign the write path to hit 10,000/s sustainably?

Q120ORM & SQLAlchemyMid

A Django-adjacent service uses SQLAlchemy sessions in a FastAPI app. In load testing, you observe that database connections are exhausted after 30 minutes even though the pool is sized at 20. Connection leak analysis shows sessions are being created but never closed. Identify the likely code pattern causing this and the correct fix.

Q121REST API DesignSenior

Your team is designing a public-facing API for a SaaS analytics platform. A product manager wants a single /report endpoint that accepts a 40-field JSON body and returns different response shapes depending on report_type. An architect argues for separate resource-oriented endpoints. The PM wants it shipped in two weeks. How do you adjudicate this conflict and what do you ship?

Q122REST API DesignMid

A mobile team reports that your API returns paginated product lists with offset/limit pagination, but users scrolling rapidly experience duplicate or missing products because the underlying sort order changes as products are updated. How do you migrate to a stable pagination scheme without breaking existing mobile clients?

Q123Packaging & DependenciesSenior

Your company runs 30 internal Python services. Each service pins its own requirements.txt. A critical security patch for cryptography drops and you need to update all 30 services within 48 hours. The current process requires manual PRs per repo. How do you redesign the dependency governance model to make this class of update take under 4 hours?

Q124Packaging & DependenciesMid

A new engineer joins and runs pip install -r requirements.txt in a fresh virtualenv, but the application crashes at startup with an ImportError on a package that is not in requirements.txt. The package is a transitive dependency of another library. How do you fix this and prevent it from recurring?

Q125Performance & ProfilingSenior

A Python ML inference service serving an XGBoost model has p99 latency of 800 ms. The model itself runs in 12 ms. Memory profiling shows 200 MB allocated per request. py-spy flame graphs show 60% of request time in json.loads and feature engineering code. You have a 100 ms p99 SLA. Walk through your optimisation strategy.

Q126Performance & ProfilingMid

A data pipeline job processes 10 GB CSV files in Python using pandas read_csv. On a 16-core machine, it saturates a single core and takes 45 minutes. The rest of the code is vectorised Pandas operations. How do you achieve near-linear scaling across all 16 cores without a full rewrite?

Q127Testing (pytest)Senior

Your team's pytest suite takes 22 minutes to run on CI. It has 1,800 tests across unit, integration, and DB tests. The build blocks on every PR. You're asked to get median PR CI time under 5 minutes without deleting tests. What is your plan?

Q128Testing (pytest)Mid

You're writing pytest tests for a service that sends transactional emails via SendGrid. A junior engineer's tests are hitting the real SendGrid API, costing $0.02 per run and occasionally flaking due to rate limits. How do you refactor the test approach and what specific pytest patterns do you apply?

Q129Celery & Task QueuesSenior

A media transcoding platform uses Celery with Redis broker. During a viral upload event, 500,000 tasks are queued in 10 minutes. Workers process at 2,000 tasks/minute. Redis memory spikes to 40 GB and the broker crashes, losing all queued tasks. How do you redesign the system to handle this traffic class without data loss?

Q130Celery & Task QueuesMid

Your Celery workers are processing payment-confirmation emails. Occasionally a task fails midway and Celery retries it, sending duplicate emails to customers. The task has no idempotency key. How do you implement idempotency to prevent duplicate sends without introducing a blocking database call on every task?

Q131Typing & PydanticSenior

A platform team is standardising on Pydantic v2 for all inter-service contracts. A legacy service emits JSON with inconsistent field naming (camelCase from JS, snake_case from Python services, and occasionally missing optional fields with null vs absent distinction). How do you build a Pydantic model layer that handles all three without duplicating schema definitions?

Q132Typing & PydanticMid

A colleague adds type hints to a critical data transformation function but does not run mypy in CI. Three weeks later, a production bug is traced to a function that claimed to return list[dict[str, int]] but actually returned a list with some None entries that the caller didn't handle. How do you set up a type-safety regime that would have caught this?

Q133CachingSenior

A high-traffic news platform's article detail page is cached in Redis for 5 minutes. When a trending article goes viral, thousands of requests arrive simultaneously the moment the cache key expires, all hitting Postgres. This thundering herd takes Postgres CPU to 100% for 30–45 seconds. How do you eliminate the thundering herd without sacrificing freshness SLA?

Q134CachingMid

A Django API caches expensive queryset results in Redis using django-redis. After a deployment that changes the queryset logic, stale cache entries return incorrect data for up to an hour. The team's current approach is to manually flush Redis after every deploy. How do you implement cache invalidation that doesn't require manual intervention?

Q135SecuritySenior

A security audit finds that your Django application is vulnerable to mass assignment: a PATCH /users/{id} endpoint allows callers to update any field including is_admin and last_login by including them in the JSON body. The codebase has 40 update endpoints. How do you remediate this systematically and prevent it from reappearing in new code?

Q136SecurityMid

A code review reveals that a Flask route accepts a filename parameter and uses it to construct a file path for download: open(f'/var/app/reports/{filename}'). A QA engineer demonstrates they can read /etc/passwd by sending filename='../../etc/passwd'. How do you fix this and what is the principle to apply for all file-serving endpoints?
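
The principle usually cited is to resolve the requested path and then verify it is still inside the allowed directory before opening it. The sketch below illustrates that check with the standard library only; `resolve_report_path` is a hypothetical helper, and the base directory is taken from the question.

```python
from pathlib import Path

REPORTS_DIR = Path("/var/app/reports")


def resolve_report_path(filename, base_dir=REPORTS_DIR):
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks
    candidate = (base / filename).resolve()
    # is_relative_to requires Python 3.9+
    if not candidate.is_relative_to(base):
        raise ValueError(f"refusing to serve a path outside {base}")
    return candidate
```

In a Flask app, `send_from_directory` performs a comparable containment check, so a hand-rolled helper like this is mainly useful for explaining the idea.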

Q137Production DebuggingSenior

A Python microservice running on Kubernetes exhibits gradual memory growth from 200 MB to 1.8 GB over 72 hours before the OOMKiller restarts it. The service processes webhooks. Heap profiling with tracemalloc shows the top allocations in CPython internals (dict resize, list append) with no clear application-code culprit. How do you locate and fix a Python memory leak of this type?

Q138Production DebuggingMid

A Django API starts returning 500 errors sporadically under load. Sentry reports are unhelpful: the stack trace points to a database connection error but the error message changes between occurrences. The DB server shows healthy metrics. What is your debugging approach and what are the most likely root causes?

Q139Core Python & Data ModelSenior

A Python library for scientific computation uses operator overloading on a Matrix class. A user reports that adding a scalar to a matrix (5 + matrix) raises a TypeError, but adding in the other direction (matrix + 5) works. What is the Python data model mechanism at play and how do you fix it?
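
The mechanism in question is the reflected operator: when `int.__add__` returns `NotImplemented`, Python tries the right operand's `__radd__`. The class below is a toy stand-in, not the library's actual `Matrix`; it supports scalar addition only, which is enough to show the fix.

```python
class Matrix:
    """Tiny illustrative matrix; only scalar addition is implemented."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __add__(self, other):
        if isinstance(other, (int, float)):
            return Matrix([[x + other for x in row] for row in self.rows])
        return NotImplemented  # let Python try the other operand

    # 5 + matrix: int.__add__ returns NotImplemented, so Python calls
    # Matrix.__radd__(matrix, 5). Scalar addition commutes, so reuse __add__.
    __radd__ = __add__

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows
```

For non-commutative operators such as subtraction, the reflected method needs its own implementation instead of an alias.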

Q140Async & Concurrency (asyncio/GIL)Senior

Your asyncio-based order-processing service uses a single event loop on a 32-core machine. CPU utilization never exceeds 6%. You are asked to scale throughput 10× on the same hardware using only Python processes. What is the architecture and what are the critical correctness constraints?

Q141DjangoSenior

A Django application's user-search endpoint executes a LIKE '%keyword%' query on a 50 million row users table. With full-text search disabled, the query takes 8 seconds. The DBA says adding a btree index won't help for leading-wildcard queries. How do you implement fast search within Django's ORM without migrating to Elasticsearch?

Q142FastAPI/FlaskSenior

A FastAPI service acts as a gateway aggregating responses from 5 downstream microservices. Under normal load it responds in 90 ms. When one downstream service degrades to 3-second responses, the entire gateway p99 jumps to 3 seconds and active connection count climbs to thousands. How do you implement bulkhead and timeout patterns to contain the blast radius?

Q143ORM & SQLAlchemyMid

A Flask app with SQLAlchemy lazily loads a User model's orders relationship inside a route. In development this works fine, but in production with gunicorn, you intermittently see DetachedInstanceError: Instance is not bound to a Session. What is the root cause and how do you fix it?

Q144REST API DesignSenior

A product team wants to add a bulk-delete endpoint: DELETE /orders with a JSON body containing an array of order IDs. An architect flags that DELETE with a body is non-standard and proxies may strip it. The team wants to delete up to 10,000 orders in one call. How do you design this endpoint correctly?

Q145Packaging & DependenciesMid

Your Python project has grown to include three services and a shared utility library in a single Git repository. Each service has its own requirements.txt and frequently gets out of sync with the utility library's version. How do you restructure the monorepo's dependency management to ensure the library version is always consistent across services?

Q146Performance & ProfilingSenior

A Python service serialises large domain objects to JSON for API responses. Under load, json.dumps is consuming 35% of total CPU according to py-spy. The objects are complex: nested dataclasses with datetime fields, Decimal fields for currency, and custom Enum types. What is your optimisation strategy and what are the trade-offs?

Q147Testing (pytest)Senior

A data engineering team's pytest suite for ETL pipelines uses real S3 and Postgres connections in all tests. New developers can't run tests locally without AWS credentials and a running database. Test setup takes 10 minutes per run. How do you redesign the test architecture for local developer productivity while keeping integration coverage?

Q148Celery & Task QueuesSenior

A Celery-based recommendation engine recalculates user recommendations on every purchase event. With 200,000 daily active users making 3 purchases each, this creates 600,000 tasks per day. Workers are frequently saturated. 80% of tasks are re-calculations for the same user within a 5-minute window, wasting CPU. How do you redesign the task dispatch strategy?

Q149Typing & PydanticMid

Your team is migrating a legacy codebase from Pydantic v1 to Pydantic v2. The codebase uses validator decorators, orm_mode=True, and .dict() extensively. How do you plan and execute the migration safely in a codebase of 200 models without introducing silent data corruption?

Q150CachingSenior

A multi-tenant SaaS platform caches API responses in Redis. A new enterprise tenant's data appears in another tenant's cached response. Post-mortem shows that cache keys included only resource IDs and not tenant IDs. How do you audit and remediate tenant data leakage via cache, and what governance prevents recurrence?

Q151SecuritySenior

A Python service accepts YAML configuration files uploaded by admin users and loads them with yaml.load(). A security researcher demonstrates remote code execution by uploading a YAML file containing Python object serialisation payloads. How do you remediate this immediately, and what is the broader policy for safe YAML handling in Python?

Q152Production DebuggingSenior

A Python batch job processes 50 million rows overnight and completes successfully, but the next morning the operations team reports that 120,000 rows are missing from the output. The job has no error logs. How do you add the instrumentation and debugging tooling to locate the data loss in a job that takes 6 hours to reproduce?

Q153Core Python & Data ModelMid

A Python web scraper uses a list to accumulate results from 5,000 pages and then deduplicates using a second pass with 'if item not in results_list'. Deduplication takes 3 minutes for 2 million items. How do you explain the performance problem and rewrite the deduplication to run in under 1 second?
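
The slowdown comes from `item not in results_list` scanning the list each time, which makes the whole pass quadratic. A hash-based rewrite does one average constant-time lookup per item. The sketch below assumes the scraped items are hashable (strings or tuples, say); `dedupe_preserving_order` is a hypothetical name.

```python
def dedupe_preserving_order(items):
    """Drop duplicates in O(n) average time, keeping first-seen order.

    Items must be hashable; dicts preserve insertion order on Python 3.7+.
    """
    return list(dict.fromkeys(items))
```

If the items are dicts or other unhashable objects, one option is to deduplicate on a derived key such as the page URL.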

Q154DjangoMid

A Django application sends welcome emails synchronously in the user registration view. During a marketing campaign that drove 5,000 sign-ups in an hour, the view timed out and users received 504 errors even though accounts were created. How do you decouple email sending from the registration flow without breaking the user experience?

Q155FastAPI/FlaskMid

A Flask API uses flask-jwt-extended for authentication. A tester discovers that a JWT token remains valid after the user logs out. The tokens have a 24-hour expiry. How do you implement token revocation without requiring a database query on every request?

Q156ORM & SQLAlchemySenior

A reporting service uses SQLAlchemy to build dynamic queries from user-supplied filter combinations (date range, status, region, assigned_user). A code review reveals that some filter values are interpolated directly into query strings using Python f-strings rather than bound parameters. How do you audit for SQL injection and refactor safely?

Q157Performance & ProfilingMid

A Python service builds a large in-memory graph of 500,000 nodes and 2 million edges using plain Python dicts and lists. Construction takes 8 minutes and uses 12 GB RAM. The team needs it to fit in 4 GB and build in under 2 minutes. How do you approach this without switching to a graph database?

Q158SecurityMid

A Django API for a healthcare application logs full request and response bodies to help debug integration issues with a hospital partner. A compliance officer flags that patient names and diagnosis codes are appearing in plaintext in the application logs, which are shipped to an ELK stack accessible to engineers. How do you fix this immediately and structurally?

Q159Celery & Task QueuesMid

A Celery task that generates PDF reports is being called from a Django view with .delay(). Users complain that sometimes reports are never generated and there is no error visible in the UI. Worker logs show the tasks disappear without a trace. How do you instrument the task lifecycle to identify and fix the silent failure?

Q160Typing & PydanticSenior

Your team builds an event-driven architecture where 20 different event types are published to a Kafka topic as JSON. Consumers need to deserialise events correctly based on the event_type field. Using a manual if-elif chain to instantiate different Pydantic models is growing unmaintainable. How do you implement a typed, extensible event dispatcher?

Q161CachingMid

A Python service caches a user's permission set in Redis for 10 minutes to avoid repeated database queries. When an admin revokes a user's access, the user continues to have access for up to 10 minutes. The business requires access revocation to take effect within 10 seconds. How do you redesign the caching strategy?

Q162Production DebuggingMid

A FastAPI service running on a single VM shows CPU at 5% and memory at 20%, but p99 latency is 4 seconds. Adding more workers doesn't help. The endpoint makes no database calls. Removing a specific call to an external partner API drops latency to 80 ms. The partner API responds in 200 ms on average. How do you diagnose and fix the 4-second tail latency?

Q163DjangoSenior

Your Django application is deployed to AWS with an Aurora Postgres cluster. During end-of-month billing runs, the database CPU hits 100% for 20 minutes, blocking all other traffic. The billing query joins 7 tables and scans 40 million invoice rows. You cannot move billing to a separate database. How do you isolate the billing workload without hardware changes?

Q164Async & Concurrency (asyncio/GIL)Mid

A Python script using concurrent.futures.ThreadPoolExecutor to parallelise CPU-intensive image resizing across 8 threads on an 8-core machine shows no speedup over single-threaded execution. The developer expected 8× speedup. Why does this happen and what is the correct tool?

Q165REST API DesignMid

A mobile app calls your API to submit a user's geolocation update every 5 seconds. With 100,000 active users, this is 20,000 requests per second to a single /location endpoint. The current implementation writes each update synchronously to Postgres. The database is struggling at 15,000 writes/second. How do you reduce database pressure while keeping location data fresh?

Q166Testing (pytest)Mid

A data science team's model-training pipeline is tested with a single integration test that takes 25 minutes to run the full training loop. PRs sit unreviewed because nobody wants to wait 25 minutes for CI. How do you add fast, meaningful test coverage without disrupting the existing full integration test?

Q167Core Python & Data ModelSenior

Your fintech startup's transaction-reconciliation script runs nightly and intermittently produces incorrect totals. A colleague suspects Python's floating-point arithmetic is the root cause. Walk through how you would diagnose and permanently fix the precision issue without rewriting the entire pipeline.
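
A quick way to confirm or rule out the floating-point hypothesis is to sum the same amounts as `float` and as `Decimal` and compare. The amounts and the rounding policy below are assumptions made for illustration, not details from the scenario.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Hypothetical ledger amounts, kept as strings so no float is ever created
amounts = ["0.10", "0.20", "0.30"]

float_total = sum(float(a) for a in amounts)      # accumulates binary rounding error
decimal_total = sum(Decimal(a) for a in amounts)  # exact decimal arithmetic


def to_cents(value):
    """Quantize to two places using banker's rounding (an assumed policy)."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
```

Constructing `Decimal` from strings matters: `Decimal(0.1)` would faithfully copy the float's error into the decimal value.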

Q168Core Python & Data ModelMid

You inherit a data-processing library where several classes use __slots__ inconsistently. Some subclasses omit __slots__ and the memory savings are lost. Explain how you would audit the codebase, fix the inheritance chain, and verify the fix with a memory benchmark.
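
The audit step can be partly automated by walking the subclass tree and flagging classes that do not declare `__slots__` themselves. The classes below are hypothetical stand-ins for the library's hierarchy, and `classes_missing_slots` is an illustrative helper name.

```python
class Base:
    __slots__ = ("x",)


class LeakySub(Base):  # no __slots__: instances regain a per-instance __dict__
    pass


class FixedSub(Base):
    __slots__ = ("y",)  # declare only the attributes this subclass adds


def classes_missing_slots(root):
    """Walk root's subclass tree; return classes that do not define __slots__."""
    missing, stack = [], [root]
    while stack:
        cls = stack.pop()
        if "__slots__" not in vars(cls):
            missing.append(cls)
        stack.extend(cls.__subclasses__())
    return missing
```

This only sees classes that have already been imported. A memory benchmark (for example with `tracemalloc`) would still be needed to confirm the savings after the fix.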

Q169Async & Concurrency (asyncio/GIL)Senior

A real-time bidding service written in asyncio starts dropping bids under load. Profiling shows the event loop is blocked for 80-120ms at irregular intervals. The team suspects a third-party sync SDK is the culprit. How do you confirm this and fix it without replacing the SDK?

Q170Async & Concurrency (asyncio/GIL)Senior

You are migrating a CPU-bound image-processing microservice from threading.Thread to multiprocessing.Pool. After the migration, throughput is lower than expected and memory usage tripled. Diagnose the cause and propose a corrected architecture.

Q171DjangoSenior

A Django e-commerce platform's order-creation endpoint intermittently raises IntegrityError on a unique constraint even though the application logic checks for duplicates before inserting. The database is PostgreSQL with a 10-node Django cluster. Explain the race condition and fix it.

Q172DjangoMid

You are asked to add full-text search to a Django blog with 500,000 posts stored in PostgreSQL. The team wants results in under 200ms at p95. Outline your implementation using Django's built-in search features before reaching for Elasticsearch.

Q173FastAPI/FlaskSenior

Your FastAPI service handling insurance-claim uploads starts returning 502s from the Nginx gateway during large-file uploads. The uvicorn logs show no errors. You suspect a timeout mismatch. Walk through the full request lifecycle and where you would set each timeout value.

Q174FastAPI/FlaskMid

A Flask application running in production sends duplicate emails to users who click 'Subscribe' twice rapidly. The view uses a simple DB insert before sending the email. How would you make this endpoint idempotent without adding a distributed lock?

Q175ORM & SQLAlchemySenior

A data pipeline using SQLAlchemy Core to bulk-load 5 million rows from CSV into PostgreSQL takes 4 hours. A colleague claims that switching to `executemany` will halve the time. What would you actually do, and what order-of-magnitude improvement can you achieve?

Q176ORM & SQLAlchemyMid

Your Django application using SQLAlchemy (standalone, not Django ORM) suddenly starts throwing TimeoutError acquiring a connection from the pool under moderate load. The pool size is set to 5. Walk through your diagnosis and the correct pool parameters to set.

Q177REST API DesignSenior

A mobile team reports that your Python REST API's paginated list endpoints are unusable because adding a new item between page fetches causes items to be skipped or duplicated. The API uses page-number pagination with an ORDER BY created_at. Propose a durable fix.
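
Keyset (cursor) pagination is the usual durable fix: the client sends the sort key of the last row it saw, and the query resumes strictly after it. The sketch below uses SQLite from the standard library purely for illustration; the `items` table, its columns, and `fetch_page` are hypothetical, and the ordering is ascending with `id` as a tie-breaker.

```python
import sqlite3


def fetch_page(conn, limit, cursor=None):
    """Return (rows, next_cursor) ordered by (created_at, id).

    cursor is the (created_at, id) pair of the last row already seen.
    """
    if cursor is None:
        rows = conn.execute(
            "SELECT id, created_at, name FROM items "
            "ORDER BY created_at, id LIMIT ?",
            (limit,),
        ).fetchall()
    else:
        last_created, last_id = cursor
        rows = conn.execute(
            "SELECT id, created_at, name FROM items "
            "WHERE created_at > ? OR (created_at = ? AND id > ?) "
            "ORDER BY created_at, id LIMIT ?",
            (last_created, last_created, last_id, limit),
        ).fetchall()
    # The cursor is derived from the last row returned
    next_cursor = (rows[-1][1], rows[-1][0]) if rows else None
    return rows, next_cursor
```

For a newest-first feed the comparisons flip to `<` with `ORDER BY created_at DESC, id DESC`. In a real API the cursor would normally be handed to clients as an opaque encoded token.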

Q178REST API DesignMid

You are designing a Python REST API for a healthcare application. A field on the patient record is nullable in the DB but a downstream service treats null and missing key as different states. How do you design the response serialization to preserve this distinction?

Q179Packaging & DependenciesSenior

Your team's CI pipeline takes 18 minutes to install dependencies because pip resolves the full dependency graph on every run. The project has 120 direct and transitive dependencies. Propose a concrete strategy to get install time under 2 minutes without sacrificing reproducibility.

Q180Packaging & DependenciesMid

A colleague submits a PR that adds a new library by pinning it as my_library==1.2.3 directly in requirements.txt. You notice the library has a transitive dependency that conflicts with your existing stack. Walk through how you catch this conflict and enforce better practices going forward.

Q181Performance & ProfilingSenior

A Python data science pipeline that joins two Pandas DataFrames (10M × 50 cols and 500k × 20 cols) consumes 28GB of RAM during the merge and crashes on a 32GB EC2 instance. Walk through your profiling and optimization approach.

Q182Performance & ProfilingMid

A Python web scraper that was processing 1,000 pages per hour suddenly drops to 200 per hour after a dependency upgrade. You have no profiling data from before the upgrade. How do you identify which upgrade caused the regression and instrument the code for ongoing performance visibility?

Q183Testing (pytest)Senior

Your test suite of 800 tests takes 12 minutes to run on CI. Developers are skipping the test run locally. Describe your strategy to get wall-clock time under 3 minutes while maintaining confidence in the suite.

Q184Testing (pytest)Mid

You need to write a pytest test for a function that sends an SMS via a third-party provider. The provider's Python SDK makes an HTTPS request internally. You want to test the logic without making real API calls or exposing credentials in CI. What is your approach?

Q185Celery & Task QueuesSenior

Your Celery-based video-processing pipeline has a memory leak: worker processes grow to 4GB over 8 hours and are killed by the OOM killer. Tasks process 10-50MB video files. The team has tried setting worker_max_tasks_per_child but the leak persists. Describe your diagnostic steps.

Q186Celery & Task QueuesMid

A Celery task that sends a welcome email is being executed three times for some new user registrations. The task is set to retry on failure with max_retries=3. What are the likely causes and how do you make the task idempotent?

Q187Typing & PydanticSenior

You are migrating a large FastAPI codebase from Pydantic v1 to Pydantic v2. Tests pass locally and in CI, but several validators decorated with @validator break silently in production because v2 changed validator execution order and the cls argument behavior. Describe your migration strategy.

Q188Typing & PydanticMid

A FastAPI endpoint accepts a JSON body with a discriminated union field: the payload can be either a BankTransfer or a CryptoTransfer, distinguished by a payment_method field. Show how you model this with Pydantic v2 and explain how FastAPI uses it for validation and OpenAPI generation.

Q189CachingSenior

A Python API serving product catalog data uses Redis as a cache. After a full catalog refresh (100k products updated in the DB), the service experiences a cache stampede for 90 seconds, causing the database to absorb 50x normal query load. Propose a concrete fix.

Q190CachingMid

A Django application caches user profile data in Redis with a per-user key. After a user updates their profile, the cache is not invalidated and users see stale data for up to 5 minutes. The team disagrees on whether to use cache.delete or cache.set with a shorter TTL. Walk through the trade-offs.

Q191SecuritySenior

A Python microservice receives a YAML configuration file from an external partner and uses yaml.load() to parse it. A security audit flags this as a critical vulnerability. The team lead says replacing the parser would break existing configs. Explain the vulnerability and your migration path.

Q192SecurityMid

Your Python Flask API returns raw database error messages (including table names and column names) to API clients when SQL queries fail. A penetration tester marks this as a medium-severity finding. How do you fix it without losing debugging information internally?

Q193Production DebuggingSenior

A Python FastAPI service running in Kubernetes starts returning intermittent 500 errors that don't appear in the application logs. The errors only occur under load and disappear after a pod restart. You have no local reproduction. What is your investigation strategy?

Q194Production DebuggingMid

Users report that a Python background job that exports CSV reports is silently producing truncated files. The job runs in a Celery worker, writes to S3, and always exits with code 0. Walk through how you would find the truncation point.

Q195Core Python & Data ModelSenior

A colleague proposes using a class-level mutable default for a configuration dict in a library component to avoid re-parsing on every instantiation. Two weeks after merging, a bug report shows that modifying the config on one instance silently mutates the config on all other instances. Explain the root cause and design a correct solution.
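
The root cause is that a class attribute is a single object shared by every instance that has not shadowed it. The sketch below reproduces the bug and shows one possible correction; `SharedConfigBug`, `Component`, and the `retries` key are hypothetical names chosen for illustration.

```python
import copy
from types import MappingProxyType


class SharedConfigBug:
    config = {"retries": 3}  # one dict object shared by every instance


class Component:
    # Parsed once, exposed read-only so accidental writes fail loudly
    _DEFAULTS = MappingProxyType({"retries": 3})

    def __init__(self, **overrides):
        # Each instance gets its own mutable copy of the defaults
        self.config = {**copy.deepcopy(dict(self._DEFAULTS)), **overrides}
```

`MappingProxyType` is read-only at the top level only; nested mutable values would still need the deep copy shown in `__init__`.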

Q196Async & Concurrency (asyncio/GIL)Mid

A junior developer on your team writes an async FastAPI endpoint that calls three independent external APIs sequentially with await. The endpoint takes 900ms. Each call individually takes about 300ms. How do you fix this and what pitfalls should the developer avoid?
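
Because the three calls are independent, they can be awaited together so that total latency approaches the slowest call and not the sum. The sketch below simulates the external APIs with `asyncio.sleep`; `call_api`, the service names, and the delays are hypothetical stand-ins.

```python
import asyncio


async def call_api(name, delay):
    """Stand-in for one external API call (simulated with sleep)."""
    await asyncio.sleep(delay)
    return f"{name}-ok"


async def sequential():
    # Each await finishes before the next call starts: about 3 x delay
    a = await call_api("carrier", 0.05)
    b = await call_api("pricing", 0.05)
    c = await call_api("stock", 0.05)
    return [a, b, c]


async def concurrent(timeout=1.0):
    # gather schedules all three at once; wait_for bounds the total wait
    return await asyncio.wait_for(
        asyncio.gather(
            call_api("carrier", 0.05),
            call_api("pricing", 0.05),
            call_api("stock", 0.05),
        ),
        timeout=timeout,
    )
```

Pitfalls worth raising in an answer include missing timeouts, deciding how a single failure should affect the other two calls, and blocking synchronous clients inside `async def`, which would serialize the calls again.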

Q197DjangoMid

A Django admin site is exposed on /admin/ on a public-facing server. A security scanner flags admin enumeration via the login endpoint (valid vs invalid usernames return different HTTP responses). How do you harden the admin without removing it?

Q198FastAPI/FlaskSenior

A FastAPI application experiences WebSocket connection drops after exactly 60 seconds of inactivity. The infrastructure team says the AWS ALB has a fixed 60-second idle timeout. The product requires connections that stay alive for up to 30 minutes. Propose a full solution.

Q199ORM & SQLAlchemySenior

A reporting query using SQLAlchemy ORM joins five tables and returns 50,000 rows, each with 30 columns. Loading them into ORM objects takes 8 seconds. The same raw SQL query in psql returns in 400ms. Diagnose the bottleneck and fix it.

Q200REST API DesignSenior

Your team is designing a Python REST API for a multi-tenant SaaS platform. A new engineer proposes filtering tenant data with ?tenant_id=123 as a query parameter on every endpoint. Explain the security and design problems with this and propose a better approach.
