PC2 PPC970FX implementation

This course covers the IBM PowerPC 970FX (PowerPC G5) CPU.


Objectives
- The course details pipeline operation in order to derive code optimization guidelines.
- Data and instruction paths between SDRAM, the L1 caches and the L2 cache are highlighted.
- The MERSI cache coherency protocol is introduced progressively, in increasing depth.
- The operation of the elastic bus is described.
- Through an FFT algorithm, the instructor shows how to vectorize processing and reduce execution time by means of data streaming.
- The performance monitor is used to optimize the performance of the FFT.

A more detailed course description is available on request at info@ac6-formation.com

Outline
OVERVIEW
- Functional units
- Key features
PPC970 PIPELINE
- Pipeline basics
- Deeply pipelined design, superscalar implementation, register renaming
- Branch prediction mechanism
- Instruction decode and preprocessing
- Instruction dispatch, sequencing and completion control, register renaming
- Dispatch group organization
- Synchronization-based instruction grouping
- Instruction latencies and throughputs
- Software optimisation guidelines
MEMORY MANAGEMENT UNIT
- MMU goals
- Data address translation, 128-entry Data ERAT, ERAT Miss Queue
- Second-level Memory Management Unit consisting of SLB and TLB
- 1024-entry 4-way set-associative TLB, 64-entry fully associative SLB
- Large page support
- Real memory limit register
- Hypervisor vs supervisor
- Support for 32-bit operating systems
INTERNAL DATA PATHS
- Data paths between load / store units, instruction queue, L2 and external bus
- Out-of-order and speculative issue of load operations
- 32-entry real address based store queues
- 32-entry load re-order queue, tracking of the order of loads
- 8-entry load miss queue
- GUS subsystem
- Core Interface Unit
- L2 cache controller
- Non Cacheable Unit
- Storage access ordering
- Hardware controlled data prefetch
- Prefetch startup sequence, stream detection
- Synchronization instructions sync, lwsync, ptesync
L1 AND L2 CACHES
- Cache basics
- 64 kB direct-mapped instruction cache
- 32 kB 2-way set-associative data cache, FIFO replacement policy, store-through policy
- 512 kB L2 cache, fully inclusive of the L1 data cache, MERSI coherency protocol
- Cache coherency, MERSI cache line states, cache state transition tables
PROGRAMMING
- Branch instructions
- The system call communication path between applications and RTOS
- Integer load / store instructions
- Integer arithmetic and logic instructions
- IEEE754 basics
- FPU operation: FPSCR register
- Float load / store instructions, floating point exceptions
- Float arithmetic instructions
- The EABI
- Code and data sections, benefits of small data areas
- 970FX specific registers
THE PERFORMANCE MONITOR
- Objectives
- Event selection
- Configuring the performance monitor bus
- Instruction matching and sampling, the 3 stages of eligibility
EXCEPTION MECHANISM
- Exception recognition and priorities
- Focus on soft patch and maintenance exceptions
- Register updates according to the exception cause
- Requirements to support exception nesting
- Precise processing of machine check exceptions
VMX IMPLEMENTATION
- VMX introduction, SIMD processing
- Intra-element vs inter-element instructions
- VMX registers, VSCR initialization
- ANSI C extension to support vector operators, new C types, new casts, vector declaration and initialization
- VMX implementation on the PPC970FX
- Data stream management
- EABI extension to support VMX
POWER AND THERMAL MANAGEMENT
- Clocking, PLL design
- Time Base and decrementer
- Frequency and voltage scaling
- Additional dynamic power management
HARDWARE IMPLEMENTATION
- Unidirectional point-to-point bus segments, source-synchronous transfers
- Packet protocols
- Snoop response
- Pipelined transactions
- Power-on procedure
- Electrical interface