A translation lookaside buffer (TLB) is a CPU cache that memory management hardware uses to improve virtual address translation speed. All current desktop and server processors (such as those based on the x86 architecture) use a TLB to map virtual addresses to physical addresses, and it is ubiquitous in hardware that uses virtual memory.
The TLB is typically implemented as content-addressable memory (CAM). The CAM search key is the virtual address and the search result is a physical address. If the requested address is present in the TLB, the CAM search yields a match quickly and the retrieved physical address can be used to access memory. This is called a TLB hit. If the requested address is not in the TLB, it is a miss, and the translation proceeds by looking up the page table in a process called a page walk. The page walk is an expensive process, as it involves reading the contents of multiple memory locations and using them to compute the physical address. After the physical address is determined by the page walk, the virtual address to physical address mapping is entered into the TLB.
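The hit and miss paths described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular processor works: the class name `SimpleTLB`, the 4096-byte page size, and the use of a flat dictionary in place of a real multi-level page walk are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical, for illustration)


class SimpleTLB:
    """Toy model of a TLB sitting in front of a page table."""

    def __init__(self, page_table):
        # page_table maps virtual page number -> physical frame number.
        self.page_table = page_table
        self.entries = {}  # cached translations (unbounded in this sketch)
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        # Split the virtual address into page number and offset in the page.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1  # TLB hit: translation is already cached
        else:
            self.misses += 1  # TLB miss: fall back to the page table
            # A single dictionary read stands in for the page walk here;
            # real hardware reads several memory locations.
            self.entries[vpn] = self.page_table[vpn]
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = SimpleTLB({0: 5, 1: 9})
first = tlb.translate(4100)   # miss: page 1 is not cached yet
second = tlb.translate(4200)  # hit: page 1 was entered after the miss
```

After the first access misses, the mapping for page 1 is entered into the model TLB, so the second access to the same page is a hit.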
A TLB has a fixed number of slots that contain page table entries, which map virtual addresses to physical addresses. Virtual memory is the address space as seen by a process. This space is divided into pages of a fixed size. The page table (generally held in memory) keeps track of where the virtual pages are located in physical memory. The TLB is a cache of the page table; that is, only a subset of its contents is stored.
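Because the number of slots is fixed, an entry must be evicted once the TLB is full. The following sketch models that behaviour. The least-recently-used replacement policy, the class name `FixedSizeTLB`, and the dictionary page table are assumptions chosen for illustration; the text above does not say which replacement policy real hardware uses.

```python
from collections import OrderedDict


class FixedSizeTLB:
    """Toy TLB holding a bounded subset of the page table (assumed LRU)."""

    def __init__(self, num_slots, page_table):
        self.num_slots = num_slots
        self.page_table = page_table  # virtual page number -> frame number
        self.slots = OrderedDict()    # oldest entry first, newest last

    def lookup(self, vpn):
        """Return (frame, hit) for a virtual page number."""
        if vpn in self.slots:
            self.slots.move_to_end(vpn)  # mark as most recently used
            return self.slots[vpn], True
        frame = self.page_table[vpn]     # stands in for the page walk
        if len(self.slots) >= self.num_slots:
            self.slots.popitem(last=False)  # evict least recently used
        self.slots[vpn] = frame
        return frame, False


tlb = FixedSizeTLB(2, {0: 10, 1: 11, 2: 12})
results = [tlb.lookup(v) for v in (0, 1, 0, 2, 1)]
```

In this run, the lookup of page 2 evicts page 1 (the least recently used entry), so the final lookup of page 1 misses again even though it was cached earlier.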
The TLB references physical memory addresses in its table. It may reside between the CPU and the CPU cache, between the CPU cache and primary storage memory, or between levels of a multi-level cache. The placement determines whether the cache uses physical or virtual addressing. If the cache is virtually addressed, requests are sent directly from the CPU to the cache, and the TLB is accessed only on a cache miss. If the cache is physically addressed, the CPU does a TLB lookup on every memory operation and the resulting physical address is sent to the cache. There are pros and cons to both implementations. Caches that use virtual addressing use part of the virtual address as their key, optionally combined with an "address space identifier" (ASID). Caches that do not have ASIDs must be flushed on every context switch in a multitasking environment, because the same virtual address can refer to different data in different processes.
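The role of the ASID can be illustrated with a small model of a virtually addressed cache. The class `VirtualCache` and its methods are hypothetical names for this sketch, and whole addresses are used as keys for simplicity, whereas real caches key on only part of the address.

```python
class VirtualCache:
    """Toy virtually addressed cache, with or without ASID tagging."""

    def __init__(self, use_asid):
        self.use_asid = use_asid
        self.lines = {}        # key -> cached value
        self.current_asid = 0  # ASID of the running process
        self.flushes = 0

    def context_switch(self, new_asid):
        if not self.use_asid:
            # Without ASIDs, entries from the old process would be
            # indistinguishable from the new one's, so flush everything.
            self.lines.clear()
            self.flushes += 1
        self.current_asid = new_asid

    def _key(self, vaddr):
        # With ASIDs, the key is the (ASID, virtual address) pair.
        return (self.current_asid, vaddr) if self.use_asid else vaddr

    def store(self, vaddr, value):
        self.lines[self._key(vaddr)] = value

    def load(self, vaddr):
        return self.lines.get(self._key(vaddr))  # None models a miss


tagged = VirtualCache(use_asid=True)
tagged.store(0x1000, "A")      # process 0 caches address 0x1000
tagged.context_switch(1)
seen_by_1 = tagged.load(0x1000)  # miss: entry belongs to ASID 0
tagged.context_switch(0)
seen_by_0 = tagged.load(0x1000)  # still cached, no flush was needed

untagged = VirtualCache(use_asid=False)
untagged.store(0x1000, "A")
untagged.context_switch(1)     # forces a full flush
after_flush = untagged.load(0x1000)
```

With tagging, process 0 finds its entry intact after switching away and back; without it, every context switch discards the entire cache contents.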
Implications for performance:
The CPU has to access main memory on any of the following:
- instruction cache miss
- data cache miss
- TLB miss
The third case (the simplest one) is where the desired information is itself in a cache, but the information for its virtual-to-physical translation is not in the TLB. All three cases are about equally slow, so a program "thrashing" the TLB will run just as poorly as one thrashing an instruction or data cache. That is why a well-functioning TLB is important.
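TLB thrashing can be made concrete with a small simulation. The function `tlb_hit_rate`, the four-slot TLB, the least-recently-used replacement policy, and the cyclic access patterns are all assumptions chosen to show the effect, not measurements of real hardware.

```python
from collections import OrderedDict


def tlb_hit_rate(page_accesses, num_slots):
    """Fraction of accesses that hit in a toy LRU TLB with num_slots slots."""
    slots = OrderedDict()
    hits = 0
    for vpn in page_accesses:
        if vpn in slots:
            hits += 1
            slots.move_to_end(vpn)  # mark as most recently used
        else:
            if len(slots) >= num_slots:
                slots.popitem(last=False)  # evict least recently used
            slots[vpn] = True
    return hits / len(page_accesses)


# Working set of 4 pages: fits in a 4-slot TLB.
fits = [i % 4 for i in range(1000)]
# Working set of 5 pages, visited cyclically: one page too many.
thrash = [i % 5 for i in range(1000)]

rate_fits = tlb_hit_rate(fits, num_slots=4)
rate_thrash = tlb_hit_rate(thrash, num_slots=4)
```

In this model, the four-page pattern misses only on its first four accesses, while the five-page cyclic pattern evicts each entry just before it is needed again and never hits at all.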
 
