-
Concurrency in High-Traffic Services
Q: You’re tasked with improving the performance of a high-traffic REST API handling 10k requests/sec. How would you approach concurrency in Python?
A:
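- Use an async framework (FastAPI/Starlette) with async/await and an ASGI server (Uvicorn) for non-blocking I/O.
- Use asyncio for I/O-bound tasks and process pools (concurrent.futures.ProcessPoolExecutor) for CPU-bound tasks; in standard CPython, thread pools only help with blocking I/O because the GIL prevents threads from running Python bytecode in parallel.
- Work around the GIL for heavy computations by offloading them to subprocesses or C extensions.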
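As a rough illustration of the asyncio side, the sketch below runs several simulated I/O calls concurrently while capping how many are in flight. The names `fetch` and `handle_batch`, the 10 ms sleep, and the concurrency limit are assumptions made for this example, not part of any framework.

```python
import asyncio


async def fetch(item_id, semaphore):
    # The semaphore caps how many simulated I/O calls run at once.
    async with semaphore:
        await asyncio.sleep(0.01)  # stands in for a non-blocking DB or HTTP call
        return {"id": item_id, "status": "ok"}


async def handle_batch(item_ids, max_concurrency=100):
    semaphore = asyncio.Semaphore(max_concurrency)
    # gather() runs the coroutines concurrently and keeps results in input order.
    return await asyncio.gather(*(fetch(i, semaphore) for i in item_ids))


results = asyncio.run(handle_batch(range(5)))
print(results)
```

Bounding concurrency this way keeps a traffic spike from opening an unbounded number of downstream connections.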
-
Debugging Memory Leaks
Q: A long-running data processing script consumes increasing memory. How would you diagnose and fix it?
A:
- Use tracemalloc or objgraph to track object allocations.
- Check for circular references (e.g., in graphs) and break them with weakref.
- Ensure generators and context managers release resources (e.g., with statements for file handles).
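One way to apply the tracemalloc step is to compare snapshots taken before and after a suspect piece of work. This is a minimal sketch; `top_allocation_growth`, `leaky_workload`, and the module-level cache are hypothetical names invented to simulate a leak.

```python
import tracemalloc

_leaky_cache = []  # hypothetical global that keeps growing


def leaky_workload():
    # Simulates a leak: every call appends buffers that are never released.
    _leaky_cache.extend(bytearray(1024) for _ in range(1000))


def top_allocation_growth(workload, limit=5):
    """Run workload() and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    workload()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() sorts the differences with the largest change first.
    return after.compare_to(before, "lineno")[:limit]


for stat in top_allocation_growth(leaky_workload):
    print(stat)
```

In a real script the two snapshots would typically be taken some minutes apart, so that steady growth stands out from one-off allocations.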
-
Metaclasses in ORM Design
Q: How would metaclasses help in designing an ORM?
A:
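A metaclass intercepts class creation, so it can collect the fields declared on each model and use them to build the database schema.
For example:

    class Field: ...              # minimal stand-ins for the ORM's column types
    class CharField(Field): ...

    class ModelMeta(type):
        def __new__(mcs, name, bases, attrs):
            new_cls = super().__new__(mcs, name, bases, attrs)
            # Store the fields on the model class itself, not on the metaclass.
            new_cls.fields = {k: v for k, v in attrs.items() if isinstance(v, Field)}
            return new_cls

    class User(metaclass=ModelMeta):
        name = CharField()
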
This auto-registers fields for SQL table generation.
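To make the "SQL table generation" step concrete, here is a small sketch that turns the registered fields into a CREATE TABLE statement. The `sql_type` attribute, `IntegerField`, and `create_table_sql` are assumptions introduced for illustration; a real ORM would also handle constraints, quoting, and dialects.

```python
class Field:
    sql_type = "TEXT"


class CharField(Field):
    sql_type = "VARCHAR(255)"


class IntegerField(Field):
    sql_type = "INTEGER"


class ModelMeta(type):
    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        # Register the declared fields on the new model class.
        new_cls.fields = {k: v for k, v in attrs.items() if isinstance(v, Field)}
        return new_cls


def create_table_sql(model):
    # Class bodies preserve declaration order, so columns appear as written.
    columns = ", ".join(f"{name} {field.sql_type}" for name, field in model.fields.items())
    return f"CREATE TABLE {model.__name__.lower()} ({columns})"


class User(metaclass=ModelMeta):
    name = CharField()
    age = IntegerField()


print(create_table_sql(User))  # CREATE TABLE user (name VARCHAR(255), age INTEGER)
```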
-
Optimizing Python with C Extensions
Q: How would you optimize a performance-critical Python function?
A:
- Rewrite the bottleneck in Cython or use ctypes/cffi to call C libraries.
- Example: Use numba for JIT compilation of numerical code.
- Profile with cProfile to identify hotspots before optimization.
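The profiling step might look like the following sketch, which uses only the standard library. `slow_sum_of_squares` is a made-up hotspot and `profile_call` is a hypothetical helper, not an API from cProfile itself.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately naive loop standing in for a real bottleneck.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # five most expensive entries
    return result, buffer.getvalue()


result, report = profile_call(slow_sum_of_squares, 1000)
print(report)
```

Only the functions that dominate this report are worth moving to Cython, numba, or C.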
-
Singleton Implementation
Q: Implement a thread-safe Singleton in Python.
A:

    from threading import Lock

    class SingletonMeta(type):
        _instances = {}
        _lock = Lock()

        def __call__(cls, *args, **kwargs):
            # Check and create under the lock so two threads cannot both instantiate.
            with cls._lock:
                if cls not in cls._instances:
                    cls._instances[cls] = super().__call__(*args, **kwargs)
            return cls._instances[cls]

Using a metaclass enforces a single instance per class, and the lock makes creation thread-safe.
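A quick way to check the behavior is to construct the class from several threads and confirm that they all receive the same object. In this sketch `AppConfig` and `build` are hypothetical names; the metaclass is repeated so the example runs on its own.

```python
from threading import Lock, Thread


class SingletonMeta(type):
    _instances = {}
    _lock = Lock()

    def __call__(cls, *args, **kwargs):
        with cls._lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class AppConfig(metaclass=SingletonMeta):  # hypothetical example class
    def __init__(self):
        self.settings = {}


created = []


def build():
    created.append(AppConfig())


threads = [Thread(target=build) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread should have received the very same object.
same = all(obj is created[0] for obj in created)
print(same)
```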
-
Resolving Deadlocks in Threading
Q: A threaded application deadlocks. How would you debug it?
A:
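- Use faulthandler (e.g., faulthandler.dump_traceback()) or pdb to see where each thread is stuck.
- Check that every thread acquires locks in the same order, and use timeouts (lock.acquire(timeout=5)) so a stuck acquisition surfaces as an error instead of hanging.
- Replace lock-protected shared state with message passing via asyncio or multiprocessing queues for safer concurrency.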
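The lock-ordering and timeout advice can be combined in a small helper. This is a sketch under stated assumptions: `ordered_locks` is a hypothetical context manager, and sorting by `id()` is just one simple way to impose a global order.

```python
import threading
from contextlib import contextmanager


@contextmanager
def ordered_locks(*locks, timeout=5.0):
    """Acquire locks in one global order (by id) and fail loudly on timeout."""
    acquired = []
    try:
        for lock in sorted(locks, key=id):
            if not lock.acquire(timeout=timeout):
                raise TimeoutError("possible deadlock: lock not acquired in time")
            acquired.append(lock)
        yield
    finally:
        for lock in reversed(acquired):
            lock.release()


lock_a, lock_b = threading.Lock(), threading.Lock()
counter = {"value": 0}


def worker(first, second):
    for _ in range(1000):
        with ordered_locks(first, second):
            counter["value"] += 1


# The two threads name the locks in opposite order, which would risk
# deadlock with naive nested acquisition.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(counter["value"])
```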
-
Securing REST APIs
Q: Design a secure authentication system for a REST API.
A:
- Use OAuth2 with JWT tokens (via Authlib or FastAPI’s OAuth2PasswordBearer).
- Store hashed passwords (bcrypt) and enforce HTTPS.
- Rate-limit endpoints with slowapi and validate inputs via Pydantic.
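The password-storage point can be sketched as follows. Note the swap: bcrypt is a third-party package, so this example uses the standard library's PBKDF2 as a stand-in to stay self-contained. The function names and the iteration count are assumptions; tune the work factor for your hardware.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); store both, never the plain password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```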
-
Async Migration Strategy
Q: How would you migrate a synchronous codebase to async?
A:
- Incrementally replace blocking calls (e.g., HTTP requests, DB queries) with async equivalents (e.g., aiohttp, asyncpg).
- Use run_in_executor for legacy blocking code.
- Refactor with async/await and adopt an async-first framework (e.g., FastAPI).
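During an incremental migration, run_in_executor lets async code call functions that have not been converted yet. In this sketch `legacy_blocking_call` is a hypothetical stand-in for such a function, with the sleep imitating a blocking driver or HTTP call.

```python
import asyncio
import time


def legacy_blocking_call(value):
    # Hypothetical not-yet-migrated function that blocks its thread.
    time.sleep(0.05)
    return value * 2


async def call_legacy_concurrently(values):
    loop = asyncio.get_running_loop()
    # Passing None uses the loop's default thread pool, so the event loop
    # stays responsive while the blocking calls run.
    futures = [loop.run_in_executor(None, legacy_blocking_call, v) for v in values]
    return await asyncio.gather(*futures)


doubled = asyncio.run(call_legacy_concurrently([1, 2, 3]))
print(doubled)  # [2, 4, 6]
```

Once a native async replacement exists for the wrapped call, the wrapper can be removed without touching its callers.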
-
Memory-Efficient Data Processing
Q: Process a 100GB CSV file on a machine with 16GB RAM.
A:
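- Use generators and pandas with chunksize:

    import pandas as pd

    # Each chunk is a DataFrame of up to 10,000 rows, so only one chunk is in memory at a time.
    for chunk in pd.read_csv('data.csv', chunksize=10_000):
        process(chunk)

Alternatively, use Dask for out-of-core computation.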
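The generator approach works without pandas as well. This minimal sketch streams rows with the standard csv module and keeps only a running total in memory; the `amount` column and both function names are assumptions made for the example.

```python
import csv
import io


def iter_rows(file_obj):
    # DictReader reads lazily, yielding one row at a time.
    yield from csv.DictReader(file_obj)


def total_amount(file_obj, column="amount"):
    # The generator expression never materializes the full file.
    return sum(float(row[column]) for row in iter_rows(file_obj))


# A small in-memory file stands in for the 100GB CSV.
sample = io.StringIO("id,amount\n1,10.5\n2,4.5\n")
total = total_amount(sample)
print(total)  # 15.0
```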
-
Distributed Task Processing
Q: Design a system to process 1M tasks across 100 workers.
A:
- Use Celery with a message broker (Redis/RabbitMQ), or Dask Distributed, which coordinates workers through its own scheduler.
- Partition tasks into queues and monitor with Flower.
- Implement idempotency and retries for fault tolerance.
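Idempotency and retries can be sketched independently of any task framework. Here a plain dict stands in for a shared result store such as Redis, and `run_idempotent`, `flaky_task`, and the backoff values are hypothetical choices for illustration.

```python
import time


def run_idempotent(task_id, func, results, max_attempts=3, base_delay=0.01):
    """Run func at most once per task_id, retrying transient failures."""
    if task_id in results:
        # A redelivered task returns the stored result instead of re-running.
        return results[task_id]
    for attempt in range(1, max_attempts + 1):
        try:
            results[task_id] = func()
            return results[task_id]
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


calls = {"count": 0}


def flaky_task():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "done"


store = {}
first = run_idempotent("task-1", flaky_task, store)
second = run_idempotent("task-1", flaky_task, store)  # served from the store
print(first, second, calls["count"])
```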
-
Dynamic Class Attributes with Descriptors
Q: Implement type-checked attributes.
A:

    class TypedAttribute:
        def __init__(self, type_):
            self.type_ = type_

        def __set_name__(self, owner, name):
            # Called once at class creation with the attribute's name.
            self.name = name

        def __set__(self, instance, value):
            if not isinstance(value, self.type_):
                raise TypeError(f"Expected {self.type_}")
            instance.__dict__[self.name] = value

    class Person:
        age = TypedAttribute(int)

Descriptors enforce type safety at runtime.
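A short usage sketch shows the check firing on assignment. The `__init__` method and the sample values are additions for illustration; the descriptor is repeated so the example runs on its own.

```python
class TypedAttribute:
    def __init__(self, type_):
        self.type_ = type_

    def __set_name__(self, owner, name):
        self.name = name

    def __set__(self, instance, value):
        if not isinstance(value, self.type_):
            raise TypeError(f"Expected {self.type_}")
        instance.__dict__[self.name] = value


class Person:
    age = TypedAttribute(int)

    def __init__(self, age):
        self.age = age  # goes through TypedAttribute.__set__


person = Person(30)
try:
    person.age = "thirty"  # wrong type, rejected by the descriptor
except TypeError as exc:
    error_message = str(exc)

print(person.age)      # 30, the invalid assignment was not stored
print(error_message)   # Expected <class 'int'>
```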
-
Database Interaction Best Practices
Q: Ensure atomic transactions in a banking app.
A:
- Use a SQLAlchemy session inside a with session.begin(): block, which commits on success and rolls back if an exception is raised.
- Set isolation levels (e.g., REPEATABLE READ) and handle exceptions with rollbacks.
- Test with pytest-sqlalchemy and use connection pooling.
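The commit-or-rollback behavior can be demonstrated with the standard library alone. This sketch swaps in sqlite3 for SQLAlchemy so it runs without extra packages; the `accounts` schema, the CHECK constraint, and the `transfer` function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    # 'with conn' commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


transfer(conn, 1, 2, 30)
try:
    transfer(conn, 1, 2, 500)  # would overdraw account 1
except sqlite3.IntegrityError:
    pass  # the credit to account 2 was rolled back along with the failed debit

final = balances(conn)
print(final)  # {1: 70, 2: 80}
```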
-
Real-Time Data Processing
Q: Process live sensor data streams.
A:
- Use Kafka with confluent-kafka for message streaming.
- Process data with asyncio or Apache Flink (via pyflink).
- Optimize with sliding windows and stateful processing.
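A sliding window with a small amount of state can be sketched in plain Python. `SlidingWindowAverage` is a hypothetical class, and the sketch assumes readings arrive in timestamp order; a streaming engine would manage this state for you.

```python
from collections import deque


class SlidingWindowAverage:
    """Average of the readings from the last window_seconds seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        self._readings.append((timestamp, value))
        self._total += value
        # Evict readings that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._readings[0][0] <= cutoff:
            _, old_value = self._readings.popleft()
            self._total -= old_value
        return self._total / len(self._readings)


window = SlidingWindowAverage(window_seconds=10)
averages = [window.add(0, 10.0), window.add(5, 20.0), window.add(12, 30.0)]
print(averages)  # [10.0, 15.0, 25.0]
```

Keeping a running total makes each update cheap, since evicted readings are subtracted instead of re-summing the window.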
-
Dependency Management
Q: Manage conflicting dependencies in a large project.
A:
- Use Poetry or pipenv for deterministic builds.
- Isolate environments with virtualenv or Docker.
- Pin versions (e.g., in requirements.txt) and rely on the tool's dependency resolver to surface conflicts.
-
Risks of Monkey Patching
Q: When would you avoid monkey patching?
A:
Avoid it in shared codebases because it causes:
- Unintended side effects (e.g., overriding core methods).
- Harder debugging and breakage when the patched library changes between versions.
Prefer subclassing or composition instead.
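As a small illustration of the composition alternative, the wrapper below changes behavior for one caller without touching the original class. `Greeter` and `LoudGreeter` are hypothetical classes invented for this sketch.

```python
class Greeter:
    def greet(self, name):
        return f"Hello, {name}"


class LoudGreeter:
    """Wraps a Greeter instead of reassigning Greeter.greet globally."""

    def __init__(self, inner):
        self._inner = inner

    def greet(self, name):
        # Only callers that opt in to LoudGreeter see the changed behavior.
        return self._inner.greet(name).upper() + "!"


plain = Greeter().greet("Ada")
loud = LoudGreeter(Greeter()).greet("Ada")
print(plain)  # Hello, Ada
print(loud)   # HELLO, ADA!
```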
-
Retry Mechanisms
Q: Implement a retry for flaky API calls.
A:
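Use tenacity:

    import requests
    from tenacity import retry, stop_after_attempt, wait_exponential

    # Make up to 3 attempts, waiting exponentially longer between them.
    @retry(stop=stop_after_attempt(3), wait=wait_exponential())
    def call_api():
        response = requests.get(...)  # request arguments elided
        response.raise_for_status()   # raises on 4xx/5xx, which triggers a retry
        return response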
-
Testing Legacy Code
Q: Add tests to untested legacy code.
A:
- Start with integration tests using pytest to pin down the current behavior.
- Use unittest.mock to isolate components.
- Refactor incrementally, tracking test coverage with pytest-cov.
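Isolating a component with a mock might look like this sketch. `RateService` and `Invoice` are hypothetical legacy classes; the test function follows pytest naming so pytest would collect it, and it is also called directly here.

```python
from unittest import mock


class RateService:
    def current_rate(self):
        # Hypothetical legacy dependency that needs the network.
        raise RuntimeError("network call; unavailable in tests")


class Invoice:
    def __init__(self, service):
        self.service = service

    def total_in_eur(self, usd):
        return round(usd * self.service.current_rate(), 2)


def test_total_in_eur_uses_current_rate():
    # spec= makes the mock reject attributes RateService does not have.
    service = mock.Mock(spec=RateService)
    service.current_rate.return_value = 0.5
    assert Invoice(service).total_in_eur(10) == 5.0
    service.current_rate.assert_called_once_with()


test_total_in_eur_uses_current_rate()
```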
-
Parallelizing CPU-Bound Tasks
Q: Parallelize a CPU-heavy task across 8 cores.
A:
Use multiprocessing.Pool:

    from multiprocessing import Pool

    # process_data and data_chunks are defined elsewhere; 8 workers, one per core.
    with Pool(8) as p:
        results = p.map(process_data, data_chunks)

Or leverage joblib.Parallel for scikit-learn workflows.
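A fuller sketch of the Pool approach, runnable as a script, is shown below. The sum-of-squares workload and the `split_into_chunks` helper are assumptions for illustration. The `__main__` guard matters because worker processes may re-import the module on platforms that spawn rather than fork.

```python
from multiprocessing import Pool


def process_data(chunk):
    # Stand-in for the CPU-heavy work done on each chunk.
    return sum(x * x for x in chunk)


def split_into_chunks(items, n_chunks):
    # Ceiling division so that at most n_chunks slices are produced.
    size = max(1, -(-len(items) // n_chunks))
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    data_chunks = split_into_chunks(list(range(1_000)), 8)
    with Pool(8) as pool:
        partial_sums = pool.map(process_data, data_chunks)
    print(sum(partial_sums))
```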
-
Python in Microservices
Q: Integrate Python into a microservices architecture.
A:
- Use FastAPI for lightweight, async-ready services.
- Containerize with Docker and orchestrate via Kubernetes.
- Implement observability (OpenTelemetry) and API gateways.
-
Optimizing Pandas for Large Datasets
Q: Speed up a slow Pandas pipeline.
A:
- Use pandas' eval() (pd.eval() or DataFrame.eval()) to evaluate vectorized column expressions.
- Convert object dtypes to category or datetime.
- Offload to Dask or Polars for parallel processing.