
RC0 VFP programming

This course explains how to use VFP instructions to accelerate multimedia algorithms.

Objectives
- This course is designed for programmers who want to develop algorithms based on hardware floating-point calculations.
- Each instruction family is detailed, first at assembly level, and then at C level using macros.
- Several tricky uses of vector instructions are presented.
- The underlying cache operation and the preload mechanisms (preload instruction and hardware prefetch) are detailed to explain how processing can be pipelined.
- The course shows how typical DSP algorithms such as FIR and FFT can be vectorized and then optimized for execution on the VFP unit.

- THIS COURSE IS OFFERED EITHER AS AN INSTRUCTOR-LED COURSE OR AS E-LEARNING.

- ACSYS has developed an optimized VFP-based FFT written in assembly language
  - Performance for 1024 complex single-precision floating-point samples is 220,000 core clock cycles (ARM11)
  - For more information, contact guillaume.peron@ac6.fr
Labs are run under RVDS

A more detailed course description is available on request at info@ac6-training.com
Prerequisites
- Knowledge of the ARMv4T / ARMv5TE instruction sets.

Outline
IEEE754 STANDARD
- Floating-point number encoding
- Denormalized numbers
- NaN utilization
- Rounding modes
- VFP FPEXC register
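
As an illustration of the number encoding covered in this chapter, here is a minimal C sketch that splits a single-precision value into its sign, exponent and fraction fields and classifies it as zero, denormal, normal, infinity or NaN. It assumes `float` is the 32-bit IEEE 754 binary32 format; the names `f32_fields`, `f32_decode`, `f32_from_bits` and `f32_classify` are hypothetical helpers, not part of the course material.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* IEEE 754 single precision: 1 sign bit, 8 exponent bits (bias 127),
   23 fraction bits. All names below are illustrative. */
typedef enum { F32_ZERO, F32_DENORMAL, F32_NORMAL, F32_INF, F32_NAN } f32_kind;

typedef struct {
    uint32_t sign;      /* 0 or 1 */
    uint32_t exponent;  /* biased exponent, 0..255 */
    uint32_t fraction;  /* low 23 bits */
} f32_fields;

/* Build a float from a raw bit pattern (assumes float is binary32). */
float f32_from_bits(uint32_t bits)
{
    float x;
    assert(sizeof x == sizeof bits);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Split a float into its three fields. */
f32_fields f32_decode(float x)
{
    uint32_t bits;
    f32_fields f;
    assert(sizeof x == sizeof bits);
    memcpy(&bits, &x, sizeof bits);
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Exponent 0 means zero or denormal; exponent 255 means infinity or NaN. */
f32_kind f32_classify(float x)
{
    f32_fields f = f32_decode(x);
    if (f.exponent == 0)
        return f.fraction == 0 ? F32_ZERO : F32_DENORMAL;
    if (f.exponent == 255)
        return f.fraction == 0 ? F32_INF : F32_NAN;
    return F32_NORMAL;
}
```

For example, `f32_decode(1.0f)` yields a biased exponent of 127 and a zero fraction, while the bit pattern `0x7FC00000` classifies as a NaN.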
INTRODUCTION TO VFPv3
- Register bank, D registers, S registers
- Instruction encoding, either ARM or Thumb-2
- Related system registers
- Alignment issues
- Context switching
VECTOR vs SCALAR OPERATION
- Length / Stride combinations
- Scalar operations
- Vector operations
- Mixed operations
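
The scalar, vector and mixed cases listed above can be sketched with a small software model in C. This is a simplified illustration, not course material: it assumes the usual single-precision short-vector rules (32 S registers in four banks of 8, a destination in the first bank makes the operation scalar, a second operand in the first bank makes it mixed, and vector elements wrap around inside their bank). The function names `vfp_elem` and `vfp_model_add` are hypothetical, and the model ignores overlapping-register cases.

```c
#include <assert.h>

enum { VFP_NUM_SREGS = 32, VFP_BANK_SIZE = 8 };

/* Register index of element i of a vector that starts at register `first`.
   Elements advance by `stride` and wrap around inside their 8-register bank. */
int vfp_elem(int first, int i, int stride)
{
    int bank = first / VFP_BANK_SIZE;
    int offset = (first % VFP_BANK_SIZE + i * stride) % VFP_BANK_SIZE;
    return bank * VFP_BANK_SIZE + offset;
}

/* Model of a single-precision add Sd = Sn + Sm for a given LEN and STRIDE.
   Simplified: source and destination vectors are assumed not to overlap. */
void vfp_model_add(float s[VFP_NUM_SREGS], int d, int n, int m,
                   int len, int stride)
{
    assert(len >= 1 && (stride == 1 || stride == 2));
    assert(len * stride <= VFP_BANK_SIZE);

    if (d < VFP_BANK_SIZE) {
        /* Destination in the first bank: scalar operation, LEN is ignored. */
        s[d] = s[n] + s[m];
        return;
    }

    /* Second operand in the first bank: mixed (vector op scalar). */
    int m_is_scalar = (m < VFP_BANK_SIZE);

    for (int i = 0; i < len; i++) {
        int mi = m_is_scalar ? m : vfp_elem(m, i, stride);
        s[vfp_elem(d, i, stride)] = s[vfp_elem(n, i, stride)] + s[mi];
    }
}
```

With LEN = 4 and STRIDE = 1, `vfp_model_add(s, 8, 16, 24, 4, 1)` models a vector add writing s8 to s11, whereas `vfp_model_add(s, 8, 16, 3, 4, 1)` models a mixed operation that adds the scalar s3 to each of s16 to s19.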
VFP LOAD / STORE INSTRUCTIONS
- Addressing modes
- Floating point load / store
- Floating point load / store multiple
- Processor acceleration mechanisms: store merging buffers
ARITHMETIC INSTRUCTIONS
- Add / subtract / absolute value instructions
- Multiply and multiply accumulate instructions
- Divide instruction
- Square root instruction
- Compare instructions
- Integer-to-FP and FP-to-integer conversion instructions
VFP CODING EXAMPLES
- FIR filter
  - Converting the scalar algorithm into a vector algorithm
  - Finding the VFP instructions to encode the vector algorithm
  - Optimizing the code
- FFT (DFT)
  - Converting the scalar algorithm into a vector algorithm, understanding how circle properties can be used to process 4 angles concurrently
  - Finding the VFP instructions to encode the vector algorithm
  - Optimizing the code
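
To give an idea of the first FIR step in the outline (converting the scalar algorithm into a vector algorithm), here is a minimal C sketch. It is an assumption about what such a conversion can look like, not the course's own code: the filter is written in the form y[n] = sum of h[k] * x[n + k], the vector width of 4 is chosen for illustration, and the names `fir_scalar` and `fir_vec4` are hypothetical. The second version restructures the loops so that each coefficient is multiplied into four neighbouring inputs at once, which is the shape a mixed scalar/vector multiply-accumulate would process.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: y[n] = sum over k of h[k] * x[n + k].
   x must hold at least nout + ntaps - 1 samples. */
void fir_scalar(const float *x, const float *h, float *y,
                size_t nout, size_t ntaps)
{
    for (size_t n = 0; n < nout; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < ntaps; k++)
            acc += h[k] * x[n + k];
        y[n] = acc;
    }
}

/* Same filter, computing 4 outputs together. For each tap, one scalar
   coefficient is multiplied into a 4-element slice of the input and
   accumulated into a 4-element accumulator. nout must be a multiple of 4. */
void fir_vec4(const float *x, const float *h, float *y,
              size_t nout, size_t ntaps)
{
    assert(nout % 4 == 0);
    for (size_t n = 0; n < nout; n += 4) {
        float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
        for (size_t k = 0; k < ntaps; k++) {
            const float tap = h[k];              /* scalar operand */
            for (size_t lane = 0; lane < 4; lane++)
                acc[lane] += tap * x[n + lane + k];  /* vector operand */
        }
        for (size_t lane = 0; lane < 4; lane++)
            y[n + lane] = acc[lane];
    }
}
```

Both functions accumulate each output in the same tap order, so they should produce the same results; the vector form simply exposes the four independent accumulations that a vector unit can handle together.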