How DRAM Bank Interleaving Boosts Memory Performance
Understanding DRAM Bank Interleaving Fundamentals
Bank interleaving is a core technique built into every DRAM chip on a Dual Inline Memory Module (DIMM). By enabling parallel access across multiple memory banks, it hides much of the access latency and, combined with prefetching, substantially increases effective memory throughput.
The core principle is staggering operations across banks. After a bank outputs a data burst, it needs recovery time (precharge, then re-activation) before it can serve the next request. Interleaving lets other banks start their bursts during that window, creating a continuous data flow: by the time the last bank finishes its burst, the first bank is ready again, eliminating idle bus cycles.
The Physics Behind Latency Reduction
DRAM cells require:
- Row activation time (tRCD): 10-20ns to open a row
- Precharge time (tRP): 10-20ns to close a row
- Burst cycle time: 2.5-5ns per data transfer
Without interleaving, these delays compound serially on every access. With 8 banks interleaving 4-transfer bursts:
Bank 0: [Activate] → [Burst] → [Precharge]
Bank 1:              [Activate] → [Burst] → [Precharge]
Each bank's activate and precharge overlap with the other banks' bursts, so the data bus stays busy: throughput improves severalfold (bounded by the bank count) while most of the activation and precharge latency is hidden.
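A toy timing model makes the overlap concrete. This is a sketch using assumed midpoint values from the ranges above (not datasheet numbers):

```python
# Rough timing model: tRCD/tRP at the 15 ns midpoint of the 10-20 ns range,
# one 4-transfer burst at 3.75 ns per transfer = 15 ns on the bus.
T_RCD = 15.0    # row activation, ns (assumed)
T_RP = 15.0     # precharge, ns (assumed)
T_BURST = 15.0  # one 4-transfer burst, ns (assumed)
BANKS = 8

# Serialized: every burst pays activate + burst + precharge in sequence.
serial = BANKS * (T_RCD + T_BURST + T_RP)

# Interleaved: the 8 banks' bursts stream back to back; each bank's
# activate/precharge hides behind the other banks' bursts.
interleaved = T_RCD + BANKS * T_BURST + T_RP

print(f"serialized:  {serial:.0f} ns for {BANKS} bursts")    # 360 ns
print(f"interleaved: {interleaved:.0f} ns for {BANKS} bursts")  # 150 ns
print(f"speedup: {serial / interleaved:.1f}x")               # 2.4x
```

With these particular numbers the bus-level speedup is 2.4x; shorter bursts or longer recovery times push the benefit (and the number of banks needed) higher.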
Address Bit Mapping: The Engine of Interleaving
Bank interleaving relies on strategic memory address allocation. Consider a model with:
- 8 banks (3-bit address)
- 8 rows (3-bit address)
- 8 columns (3-bit address)
The address bits are mapped:
| Bits 8-6 | Bit 5 | Bits 4-2 | Bits 1-0 |
|----------|-------|----------|----------|
| Row | Col | Bank | Col LSBs |
Why this "peculiar" separation? Splitting the address fields this way serves three purposes:
- Bits 1-0 increment within a burst (0→1→2→3)
- Bit 5 advances to the next column group between burst sequences
- The bank bits (4-2) change fastest between successive burst addresses, rotating accesses across banks
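The mapping can be expressed as a few shifts and masks. A minimal sketch of the toy 9-bit model above (these bit positions are the table's, not any real DIMM's):

```python
def decode(addr: int):
    """Split a 9-bit address per the mapping above:
    bits 8-6 = row, bit 5 = column MSB, bits 4-2 = bank, bits 1-0 = column LSBs."""
    row = (addr >> 6) & 0b111
    col_msb = (addr >> 5) & 0b1
    bank = (addr >> 2) & 0b111
    col_lsbs = addr & 0b11
    col = (col_msb << 2) | col_lsbs
    return row, bank, col

print(decode(0b000000000))  # (0, 0, 0)
print(decode(0b000000100))  # (0, 1, 0) -- +4 lands in the next bank
print(decode(0b000100000))  # (0, 0, 4) -- bit 5 selects the next column group
```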
Real-World Burst Sequence Walkthrough
- Initial address 000000000 (bin)
- Row:0, Bank:0, Col:00 → Accesses columns 0-3
- Next address 000000100 (+4)
- Row:0, Bank:1, Col:00 → Next burst
- Address 000100000 (after cycling through all 8 banks)
- Row:0, Bank:0, Col:4-7 (bit 5 set) → Back to bank 0, same row, next column group
Key insight: The bank address changes most frequently, distributing workload evenly across physical banks while row/column addresses change slowly. This minimizes row activations—the most latency-intensive operation.
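The walkthrough can be checked mechanically with the same bit positions. A sketch of the toy 9-bit model, not a real controller's mapping:

```python
# Walk the first 16 burst-aligned addresses (stride 4) and show which
# bank each burst lands in under the 9-bit mapping from the text.
def decode(addr):
    return ((addr >> 6) & 7,                       # row: bits 8-6
            (addr >> 2) & 7,                       # bank: bits 4-2
            ((addr >> 5) & 1) << 2 | (addr & 3))   # column: bit 5 + bits 1-0

for addr in range(0, 64, 4):
    row, bank, col = decode(addr)
    print(f"addr {addr:3d}: row {row}, bank {bank}, cols {col}-{col + 3}")
```

Running this shows each of the 8 banks touched exactly once before any bank repeats, and the row field staying at 0 throughout: sequential streaming spreads work across banks while avoiding new row activations.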
Advanced Implementation Considerations
Scaling Principles
Modern DIMMs scale this model using:
- 16 banks (DDR4) or up to 32 banks (DDR5) vs. 8 banks in DDR3 and earlier
- Burst lengths of 8 (BL8) or 16 (BL16)
- 2D/3D stacked DRAM with bank groups
Critical rule: Bank count and burst length must be powers of two (8, 16, 32) for efficient bit mapping.
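The power-of-two rule exists because bank selection then collapses to a shift and a mask. A minimal sketch using the toy model's bit positions:

```python
BANKS = 8        # must be a power of two for this trick to work
BANK_SHIFT = 2   # bank bits occupy positions 4-2 in the toy mapping

# With a power-of-two bank count, (BANKS - 1) is an all-ones mask,
# so extracting the bank is one shift and one AND -- no division.
assert BANKS & (BANKS - 1) == 0, "bank count must be a power of two"

def bank_of(addr: int) -> int:
    return (addr >> BANK_SHIFT) & (BANKS - 1)

print(bank_of(4))   # 1
print(bank_of(28))  # 7
print(bank_of(32))  # 0 -- wraps back to bank 0
```

A non-power-of-two bank count would force a modulo operation in the address path, which is why hardware designs avoid it.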
Performance Tradeoffs
| Config | Effective latency | Throughput (illustrative) | Power |
|---|---|---|---|
| 8 banks | Baseline | 64 GB/s | Baseline |
| 16 banks | ~34% lower | 128 GB/s | ~22% higher |
| 32 banks | ~41% lower | 256 GB/s | ~38% higher |
DDR4/DDR5 refinement: Bank groups act as "super banks," letting back-to-back operations in different groups overlap more tightly than operations within one group, further reducing contention.
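Bank groups simply add another field to the address. A hedged sketch of a DDR4-style split (2 group bits × 2 bank bits = 16 banks; the bit positions here are illustrative, not taken from the JEDEC spec):

```python
def decode_bg(addr: int):
    """Extract (bank group, bank within group) from an address.
    Field positions are illustrative, not a real controller's mapping."""
    bank_in_group = (addr >> 3) & 0b11  # 2 bank bits
    group = (addr >> 5) & 0b11          # 2 bank-group bits
    return group, bank_in_group

# Consecutive accesses landing in *different* groups can be issued at the
# short spacing (tCCD_S); accesses within one group pay the long spacing
# (tCCD_L), so spreading traffic across groups keeps the bus busier.
a, b = decode_bg(0b0000000), decode_bg(0b0100000)
print(a, b)  # (0, 0) (1, 0) -- different groups, same bank index
```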
Practical Implementation Toolkit
5-Step Design Checklist
- Calculate minimum banks: Banks ≥ (tRCD + tRP) / burst interval
- Align burst length to the cache line size (a 64 B line = BL8 on a 64-bit bus)
- Dedicate log2(banks) address bits to bank selection
- Place the bank bits below the row bits and just above the burst-offset column bits, so successive bursts rotate across banks
- Validate timing: respect tFAW, the four-activate window that limits row activations to four per window
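Step 1 of the checklist, worked with assumed DDR4-class timings (representative values, not from any datasheet):

```python
import math

tRCD = 13.75          # ns, row activate time (assumed)
tRP = 13.75           # ns, precharge time (assumed)
burst_interval = 5.0  # ns one burst occupies the data bus (assumed)

# Each bank needs (tRCD + tRP) of recovery hidden behind other banks' bursts.
min_banks = math.ceil((tRCD + tRP) / burst_interval)

# Round up to the next power of two for clean address-bit mapping.
banks = 1 << (min_banks - 1).bit_length()

print(f"minimum banks: {min_banks} -> use {banks}")  # minimum banks: 6 -> use 8
```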
Recommended Resources
- JEDEC DDR5 Standard (JESD79-5) - Authoritative specs for bit mapping
- Micron TN-ED-03: DDR4 Bank Group Architecture - Real-world latency analysis
- DRAMSim3 - Open-source simulator for prototyping configurations
"Bank interleaving isn't just about parallelism—it's about orchestrating timing dependencies to transform staggered operations into a seamless flow."
What's your biggest challenge in memory design? Share whether it's bank count limitations, address mapping constraints, or timing validation in the comments below!