Cray T3E User’s Guide
3.5 Local memory hierarchy
The local four-level memory hierarchy of the processing elements is
shown in Figure 3.3. Nearest to the execution units are the registers.
Caches for instructions and data (ICACHE and DCACHE) are each of size
8 kB. The second-level cache, SCACHE (96 kB in total), is on the Alpha
chip. The fourth level of the memory hierarchy is the main (DRAM)
memory (128 MB).
[Figure 3.3 diagram: registers and functional units; ICACHE (instruction cache, 8 kB); DCACHE (data cache, 8 kB); SCACHE (2nd level cache, 96 kB); streams; DRAM memory (128 MB).]
Figure 3.3: The local memory hierarchy.
It takes 2 clock periods (cp) to start moving a value from the first-level
data cache (DCACHE) to the registers. The bandwidth is 16 bytes per cp.
The size of the DCACHE is 8 kB, or 1024 words of 8 bytes. The cache
is divided into 256 lines of 32 bytes each. Each read operation allocates
one line in the DCACHE for moving data from the second-level cache (SCACHE)
or the main memory. This means that four consecutive 64-bit words
are read at a time. Therefore, arrays should always be accessed with a
stride of one!
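The line geometry above can be checked with a little arithmetic. The following Python sketch is only an illustration: it assumes an array of 8-byte words starting at byte address 0, and the name `lines_touched` is made up for this example.

```python
# Sketch of the DCACHE geometry described above.
DCACHE_BYTES = 8 * 1024                     # 8 kB
LINE_BYTES = 32                             # one cache line
WORD_BYTES = 8                              # one 64-bit word
NUM_LINES = DCACHE_BYTES // LINE_BYTES      # 256 lines
WORDS_PER_LINE = LINE_BYTES // WORD_BYTES   # 4 words per line


def lines_touched(num_words, stride_words):
    """Count the distinct 32-byte memory lines that must be fetched when
    reading num_words array elements with the given stride (in words).
    Assumes the array starts at byte address 0 (an illustrative choice)."""
    touched = set()
    for i in range(num_words):
        byte_address = i * stride_words * WORD_BYTES
        touched.add(byte_address // LINE_BYTES)
    return len(touched)


if __name__ == "__main__":
    print(NUM_LINES, WORDS_PER_LINE)   # 256 4
    print(lines_touched(1024, 1))      # 256: every fetched word is used
    print(lines_touched(1024, 4))      # 1024: one useful word per line
```

With a stride of one, 1024 words need only 256 line fetches. With a stride of four words, the same number of elements needs 1024 line fetches, and three quarters of the data moved is never used.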
For example, if a loop accesses array elements that are 8 kB apart in
memory, all the elements map to the same DCACHE line, and each access
evicts the previously loaded element. Therefore the data has to be fetched
from a lower level of the memory hierarchy each time. This kind of memory
reference pattern slows down the program considerably.
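This effect can be reproduced with a toy model of the cache. The sketch below is not a description of the actual hardware: it assumes a direct-mapped cache in which the slot is the memory line number modulo 256, and the name `count_misses` is hypothetical.

```python
# Toy model of a direct-mapped 8 kB cache with 32-byte lines. The mapping
# (memory line number modulo the number of slots) is an assumption based on
# the DCACHE description; this is not a cycle-accurate simulation.
LINE_BYTES = 32
NUM_LINES = 256


def count_misses(addresses):
    """Return the number of cache misses for a sequence of byte addresses."""
    resident = [None] * NUM_LINES   # memory line currently held in each slot
    misses = 0
    for addr in addresses:
        mem_line = addr // LINE_BYTES
        slot = mem_line % NUM_LINES
        if resident[slot] != mem_line:
            resident[slot] = mem_line   # evict the old line, load the new one
            misses += 1
    return misses


if __name__ == "__main__":
    sweeps = 10
    # Eight consecutive 8-byte words, read ten times over.
    unit_stride = [i * 8 for i in range(8)] * sweeps
    # Eight words spaced 8 kB apart, read ten times over.
    far_apart = [i * 8192 for i in range(8)] * sweeps
    print(count_misses(unit_stride))   # 2
    print(count_misses(far_apart))     # 80
```

In this model the eight consecutive words occupy two lines and miss only on the first sweep, whereas the eight words spaced 8 kB apart all compete for slot 0 and miss on every one of the 80 accesses.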
The second-level cache (SCACHE) is of size 96 kB. This cache is three-way
set-associative, which means that each location in the main memory
can be loaded to three different locations in the SCACHE. This mapping
is random and the programmer cannot dictate it. Therefore, from the
programmer’s point of view, the SCACHE is actually of size 32 kB, or a
third of the physical size.
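The arithmetic behind the 32 kB figure can be sketched as follows. This is a simplified model under stated assumptions: it treats addresses that are a whole multiple of 32 kB apart as competing for the same three SCACHE locations, which is the usual behaviour of a set-associative cache but is not spelled out in this guide. The helper name `max_set_pressure` is hypothetical.

```python
from collections import Counter

SCACHE_BYTES = 96 * 1024
WAYS = 3                                   # three-way set-associative
WORD_BYTES = 8
WAY_BYTES = SCACHE_BYTES // WAYS           # 32768 bytes = 32 kB
WORDS_PER_WAY = WAY_BYTES // WORD_BYTES    # 4096 words


def max_set_pressure(addresses):
    """Return the largest number of distinct byte addresses that share the
    same offset within a 32 kB way. Under the assumed mapping, more than
    WAYS such addresses cannot all stay in the SCACHE at once."""
    offsets = Counter(addr % WAY_BYTES for addr in set(addresses))
    return max(offsets.values(), default=0)


if __name__ == "__main__":
    print(WAY_BYTES, WORDS_PER_WAY)                        # 32768 4096
    three = [k * WAY_BYTES for k in range(3)]
    four = [k * WAY_BYTES for k in range(4)]
    print(max_set_pressure(three) <= WAYS)                 # True
    print(max_set_pressure(four) <= WAYS)                  # False
```

Three addresses spaced 32 kB apart can coexist, one per way. A fourth address at the same spacing exceeds the associativity, so at least one of them must be evicted.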
Each part of the set-associative SCACHE is direct-mapped to the memory
in the same way as the DCACHE is. You can fit 4096 words (of 8 bytes each)
into each of the three parts of the SCACHE. The latency of the SCACHE is 8 cp
for moving data to the DCACHE. The bandwidth is 16 bytes per cp, or