Duo Core Processors and Multiple Caches
What is the hype all about?
Doug Willoughby
February 20, 2007
Agenda
What is Duo Core and what are the requirements for use
Description of cache memory
History of cache development
Some recent developments
Some Duo Core specs
What is Duo Core
A processor chip that has two processing units instead of one
Built into a single package
Can run two applications or two processes simultaneously
News Flash February 12, 2007
Intel announced an experimental chip design with 80 cores to give enormous calculating power at low power requirements
Applications/Processes
Consist of instruction sequences and the data associated with each instruction
Most are sequential; this is called a single thread
Some applications can have multiple threads, which run simultaneously
A single thread cannot take advantage of dual core; multiple threads can
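As a minimal sketch of the single-thread versus multiple-thread distinction, the Python below computes the same sum once as one sequential stream and once split across two worker threads. The function names (`partial_sum`, `sum_single_thread`, `sum_two_threads`) are hypothetical and chosen only for illustration. In standard CPython the global interpreter lock keeps CPU-bound threads from truly running in parallel, so this shows the structure a dual core could exploit, not a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start, stop):
    """Sum the integers in [start, stop) on whichever thread calls this."""
    return sum(range(start, stop))


def sum_single_thread(n):
    """One sequential instruction stream: only one core can be kept busy."""
    return partial_sum(0, n)


def sum_two_threads(n):
    """Split the work into two independent halves, one per worker thread."""
    mid = n // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        low = pool.submit(partial_sum, 0, mid)
        high = pool.submit(partial_sum, mid, n)
        # Both halves are independent, so a dispatcher is free to place
        # them on different cores; the results are combined at the end.
        return low.result() + high.result()
```

Both functions return the same value; only the second gives the operating system two threads to dispatch.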
Operating System Requirement
Must implement a task dispatcher that can handle:
multiple threads of one application, or
multiple applications, or
multiple operating system processes, or
combinations of the above
Single Core Processing
Task Manager implements preemptive multitasking
One process or application runs until another, higher-priority task needs the processor; then a switch occurs
Can also switch if one process encounters a delay for data (I/O from CD/DVD/Internet)
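The switching rule above can be sketched as a toy simulation. This is an illustrative model under stated assumptions, not how any particular operating system is implemented: the function name `run_preemptive`, the tuple layout, and the convention that a larger number means higher priority are all hypothetical. It models only priority preemption, not the switch on an I/O delay.

```python
def run_preemptive(tasks):
    """Simulate a single core with preemptive priority scheduling.

    tasks: list of (name, arrival_tick, priority, ticks_needed);
    a larger priority number wins (an assumption of this sketch).
    Returns the task name run at each tick, or None when the core idles.
    """
    remaining = {name: need for name, _, _, need in tasks}
    timeline = []
    tick = 0
    while any(remaining.values()):
        ready = [t for t in tasks if t[1] <= tick and remaining[t[0]] > 0]
        if not ready:
            timeline.append(None)  # nothing has arrived yet: core idles
        else:
            # Re-pick every tick, so a newly arrived higher-priority
            # task preempts whatever was running before.
            name = max(ready, key=lambda t: t[2])[0]
            remaining[name] -= 1
            timeline.append(name)
        tick += 1
    return timeline


timeline = run_preemptive([("editor", 0, 1, 4), ("audio", 2, 5, 2)])
print(timeline)
# ['editor', 'editor', 'audio', 'audio', 'editor', 'editor']
```

The lower-priority "editor" task is interrupted at tick 2 when "audio" arrives, and resumes only after "audio" finishes.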
Hyper Threading Technology
Running multiple single-thread applications through a single processor, sharing unused cycles
A compromise technology between single and dual core technologies
Duo Core Technology
Running multiple single-thread applications through two processors
Of What Use is Dual Core
None if running only one single-thread application or process
Not much if running multiple applications or processes with low processor utilization
Great if you have multiple applications with high processor utilization that can run simultaneously
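A rough way to see these three cases is to estimate how long a set of single-thread tasks takes on one core versus two. The sketch below is a simplified model, assuming each task needs a fixed amount of processor time and runs entirely on one core; the function name `finish_time` and the workloads are hypothetical.

```python
import heapq


def finish_time(task_seconds, cores):
    """Time to finish single-thread tasks when each runs on one core.

    Greedy model: each task, longest first, goes to the core that
    becomes free earliest.
    """
    free_at = [0.0] * cores
    heapq.heapify(free_at)
    for seconds in sorted(task_seconds, reverse=True):
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + seconds)
    return max(free_at)


workloads = {
    "one heavy task": [10.0],
    "two heavy tasks": [10.0, 10.0],
    "four light tasks": [0.1, 0.1, 0.1, 0.1],
}
for label, work in workloads.items():
    print(label, finish_time(work, 1), finish_time(work, 2))
```

In this model the single heavy task takes just as long on two cores as on one, the two heavy tasks finish in half the time, and the light tasks save only a fraction of a second in absolute terms.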
News Flash February 19, 2007
AMD announces new Barcelona Quad Core processor chip.
Includes four cores with supporting circuits on one chip
Intel Quad Core puts two dual-core chips on the same module (Clovertown, built from two Woodcrest dies)
L1 and L2 Cache
Why include a cache at all?
Cache is much smaller, much faster, more expensive per bit memory
Interfaces to off-chip RAM
RAM is much larger, much slower, cheaper per bit memory
Because of Locality of Reference, the combination appears as faster memory at the lower cost
News Flash Feb 14, 2007
IBM announces a breakthrough that allows substitution of eDRAM in place of SRAM on the chips.
Dramatically reduces the space requirement on the chip for L2 cache and also L1 cache
Locality of Reference
Theory that when applications operate, only a small kernel of instructions and data is required at any one time.
If that kernel is stored in a small, fast buffer, and the processor goes to the larger, slower RAM only when an item is not in the buffer, the combination operates at an average speed close to the speed of the buffer at a cost close to the cost of RAM
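The "average speed" claim can be made concrete with a small calculation. The sketch below uses a deliberately simple model, assuming a hit costs the buffer time and a miss costs the RAM time; the function name `effective_access_ns` is hypothetical. The 80 ns and 960 ns figures and the 96 to 99 percent hit ratios are the ones quoted elsewhere in this deck.

```python
def effective_access_ns(hit_ratio, buffer_ns, ram_ns):
    """Average access time when misses fall through to the slower RAM.

    Simplified model: a hit costs buffer_ns, a miss costs ram_ns.
    """
    return hit_ratio * buffer_ns + (1.0 - hit_ratio) * ram_ns


for ratio in (0.96, 0.99):
    average = effective_access_ns(ratio, buffer_ns=80, ram_ns=960)
    print(f"hit ratio {ratio:.2f}: about {average:.1f} ns on average")
```

With these inputs the average works out to roughly 115 ns at a 96 percent hit ratio and roughly 89 ns at 99 percent, far closer to the 80 ns buffer than to the 960 ns RAM.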
IBM Performance Evaluation
Complex computer designs required complex simulations driven by instruction streams
Streams were created by tracing real benchmark workloads
Traces included the addresses of instructions, the instructions themselves, and the addresses of data as well as the data itself
Cache Development
IBM Research extracted all the addresses from streams
Confirmed that a small fast buffer could store most current instructions and data
All instructions and data stored in slower RAM
The effect was that all data appeared to be accessible at near buffer speed if the hit ratio in the buffer was 96 to 99 percent
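The idea of replaying extracted addresses through a small buffer can be illustrated with a toy trace-driven simulation. This is only a sketch and not IBM's simulator, whose details this deck does not give: the fully associative least-recently-used buffer, the 64-byte line size, the function name `hit_ratio`, and the synthetic loop trace are all assumptions made for the example. The trace must be non-empty.

```python
from collections import OrderedDict


def hit_ratio(address_trace, buffer_lines, line_bytes=64):
    """Replay an address trace through a small fully associative LRU buffer."""
    buffer = OrderedDict()  # line number -> None, least recently used first
    hits = 0
    for address in address_trace:
        line = address // line_bytes
        if line in buffer:
            hits += 1
            buffer.move_to_end(line)  # mark as most recently used
        else:
            buffer[line] = None  # miss: fetch the line from RAM
            if len(buffer) > buffer_lines:
                buffer.popitem(last=False)  # evict the least recently used line
    return hits / len(address_trace)


# Hypothetical trace: a loop that sweeps the same 4 KB region 100 times.
trace = [address for _ in range(100) for address in range(0, 4096, 4)]
print(hit_ratio(trace, buffer_lines=128))
```

Because the loop keeps returning to the same small region, nearly every access after the first sweep is found in the buffer, which is the locality effect the address traces confirmed.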
Cache Development
L1 cache was initially implemented in the System/360 Model 85 (circa 1968), then in the smaller System/370 models
Later, separate instruction and data caches were implemented.
Most later computer designs and chip designs incorporate L1 and L2 caches as well as data and instruction separation
Cache 1968 vs Today
System 360 Model 85 > $1,000,000
32K byte Cache 80 ns cycle time
4M byte RAM 960 ns access time
Magnetic Core memory
Pentium 4 PC < $1000
32K byte L1 Cache 3 cycle access
4M byte L2 Cache 12 cycle access
1G byte RAM
Examples of Use
Adobe Photoshop and Elements can run multiple threads.
Web page servers
Gaming applications
Applications with high computing requirements
Grid computing applications