PC2 PPC970FX implementation

This course covers the IBM PowerPC 970FX (G5) CPU.

Objectives
- The course details pipeline operation in order to derive code optimization guidelines.
- Data and instruction paths between SDRAM, the L1 caches and the L2 cache are highlighted.
- The MERSI cache coherency protocol is introduced progressively, in increasing depth.
- The operation of the elastic bus is described.
- Using an FFT algorithm, the instructor shows how to vectorize processing and reduce execution time with data streaming.
- The performance monitor is used to optimize the performance of the FFT.

A more detailed course description is available on request from info@ac6-formation.com

Outline
OVERVIEW
- Functional units
- Key features
PPC970 PIPELINE
- Pipeline basics
- Deeply pipelined design, superscalar implementation, register renaming
- Branch prediction mechanism
- Instruction decode and preprocessing
- Instruction dispatch, sequencing and completion control, register renaming
- Dispatch group organization
- Synchronization-based instruction grouping
- Instruction latencies and throughputs
- Software optimisation guidelines
MEMORY MANAGEMENT UNIT
- MMU goals
- Data address translation, 128-entry Data ERAT, ERAT Miss Queue
- Second-level Memory Management Unit consisting of SLB and TLB
- 1024-entry 4-way set associative TLB, 64-entry fully associative SLB
- Large page support
- Real memory limit register
- Hypervisor vs supervisor
- Support for 32-bit operating systems
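
As a rough illustration of the TLB organisation listed above, the short Python sketch below derives the number of sets and the number of index bits implied by a 1024-entry, 4-way set associative structure. The helper name `tlb_sets` is hypothetical, and the sketch makes no claim about which address bits the hardware actually uses as the index.

```python
def tlb_sets(entries: int, ways: int) -> int:
    """Number of sets in a set associative TLB: entries / ways."""
    if entries % ways:
        raise ValueError("entries must be a multiple of the associativity")
    return entries // ways


# Figures taken from the outline: 1024 entries, 4 ways.
sets = tlb_sets(1024, 4)

# For a power-of-two set count, log2(sets) index bits select one set.
index_bits = sets.bit_length() - 1

print(sets, index_bits)  # prints "256 8"
```

By contrast, a fully associative structure such as the 64-entry SLB has a single set, so every entry is compared on each lookup and no index bits are needed.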
INTERNAL DATA PATHS
- Data paths between load/store units, instruction queue, L2 and external bus
- Out-of-order and speculative issue of load operations
- 32-entry real address based store queues
- 32-entry load re-order queue, tracking of the order of loads
- 8-entry load miss queue
- GUS subsystem
- Core Interface Unit
- L2 cache controller
- Non Cacheable Unit
- Storage access ordering
- Hardware controlled data prefetch
- Prefetch startup sequence, stream detection
- Synchronization instructions: sync, lwsync, ptesync
L1 AND L2 CACHES
- Cache basics
- 64 kB direct-mapped instruction cache
- 32 kB 2-way set associative data cache, LRU replacement policy, store-through policy
- 512 kB L2 cache, fully inclusive of the L1 data cache, MERSI coherency protocol
- Cache coherency, MERSI cache line state, cache state transition tables
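
To make the L1 cache figures above more concrete, here is a minimal Python sketch that computes how many sets each L1 cache contains. The outline does not state a line size; the 128-byte value used here is an assumption, and `cache_sets` is a hypothetical helper written only for this illustration.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets = capacity / (associativity * line size)."""
    lines, remainder = divmod(size_bytes, line_bytes)
    if remainder or lines % ways:
        raise ValueError("capacity must be a whole number of sets")
    return lines // ways


LINE_BYTES = 128  # assumption: the line size is not given in this outline

# 64 kB direct-mapped instruction cache (direct-mapped means 1 way).
icache_sets = cache_sets(64 * 1024, ways=1, line_bytes=LINE_BYTES)

# 32 kB 2-way set associative data cache.
dcache_sets = cache_sets(32 * 1024, ways=2, line_bytes=LINE_BYTES)

print(icache_sets, dcache_sets)  # prints "512 128"
```

Under that assumption, two addresses that are 64 kB apart always compete for the same instruction cache line, which is one reason code layout matters on a direct-mapped cache.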
PROGRAMMING
- Branch instructions
- The system call communication path between applications and RTOS
- Integer load/store instructions
- Integer arithmetic and logic instructions
- IEEE754 basics
- FPU operation: FPSCR register
- Floating-point load/store instructions, floating-point exceptions
- Floating-point arithmetic instructions
- The EABI
- Code and data sections, benefits of small data areas
- 970FX specific registers
THE PERFORMANCE MONITOR
- Objectives
- Event selection
- Configuring the performance monitor bus
- Instruction matching and sampling, the 3 stages of eligibility
EXCEPTION MECHANISM
- Exception recognition and priorities
- Focus on soft patch and maintenance exceptions
- Register updates according to the exception cause
- Requirements to support exception nesting
- Precise processing of machine check exceptions
VMX IMPLEMENTATION
- VMX introduction, SIMD processing
- Intra vs inter element instructions
- VMX registers, VSCR initialization
- ANSI C extension to support vector operators: new C types, new casts, vector declaration and initialization
- VMX implementation on the PPC970FX
- Data stream management
- EABI extension to support VMX
POWER AND THERMAL MANAGEMENT
- Clocking, PLL design
- Time Base and decrementer
- Frequency and voltage scaling
- Additional dynamic power management
HARDWARE IMPLEMENTATION
- Unidirectional point-to-point bus segments, source synchronized transfers
- Packet protocols
- Snoop response
- Pipelined transactions
- Power-on procedure
- Electrical interface