RC0 VFP programming

This course explains how to use VFP instructions to accelerate multimedia algorithms.


Objectives
• This course is designed for programmers who want to develop algorithms based on hardware floating-point calculations.
• Each instruction family is detailed, first at assembly level, then at C level using macros.
• Several tricky uses of vector instructions are presented.
• The underlying cache operation and the preload mechanisms (preload instructions and hardware prefetch) are detailed to explain how processing can be pipelined.
• The course shows how typical DSP algorithms such as FIR and FFT can be vectorized and then optimized for execution on the VFP unit.

• THIS COURSE IS OFFERED EITHER AS AN INSTRUCTOR-LED COURSE OR AS E-LEARNING.

• ACSYS has developed an optimized VFP-based FFT coded in assembly language
  • Performance for 1024 complex single-precision floating-point samples is 220,000 core clock cycles (ARM11)
  • For more information, contact guillaume.peron@ac6.fr
Labs are run under RVDS

A more detailed course description is available on request at info@ac6-training.com
Prerequisites
  • Knowledge of the ARMv4T / ARMv5TE instruction sets.

Outline
IEEE754 STANDARD
  • Floating-point number encoding
  • Denormalized numbers
  • NaN usage
  • Rounding modes
  • VFP FPEXC register
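
The number encoding, denormalized numbers and NaN topics listed above can be illustrated with a small C sketch that splits a single-precision value into its IEEE 754 fields. This is only an illustration and not course material: the names f32_fields, f32_decode, f32_classify, f32_bits and f32_from_bits are hypothetical, and the code assumes that float is the 32-bit IEEE 754 binary32 format on the target.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE 754 single-precision (binary32) value. */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative        */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127        */
    uint32_t fraction;  /* 23 bits, the part after the binary point */
} f32_fields;

typedef enum { F32_ZERO, F32_DENORMAL, F32_NORMAL, F32_INFINITY, F32_NAN } f32_class;

/* Reinterpret the bytes of a float as a 32-bit integer (assumes binary32). */
uint32_t f32_bits(float x)
{
    uint32_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}

/* Inverse of f32_bits: build a float from a raw bit pattern. */
float f32_from_bits(uint32_t b)
{
    float x;
    memcpy(&x, &b, sizeof x);
    return x;
}

/* Split a float into sign, biased exponent and fraction. */
f32_fields f32_decode(float x)
{
    uint32_t b = f32_bits(x);
    f32_fields f;
    f.sign     = b >> 31;
    f.exponent = (b >> 23) & 0xFFu;
    f.fraction = b & 0x7FFFFFu;
    return f;
}

/* Classify a float from its exponent and fraction fields. */
f32_class f32_classify(float x)
{
    f32_fields f = f32_decode(x);
    if (f.exponent == 0)
        return f.fraction == 0 ? F32_ZERO : F32_DENORMAL;
    if (f.exponent == 0xFF)
        return f.fraction == 0 ? F32_INFINITY : F32_NAN;
    return F32_NORMAL;
}
```

For example, f32_decode(1.0f) yields a zero sign, a biased exponent of 127 and a zero fraction, while the bit pattern 0x00000001 (the smallest positive denormal) is classified as F32_DENORMAL.
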
INTRODUCTION TO VFPv3
  • Register bank, D registers, S registers
  • Instruction encoding, either ARM or Thumb-2
  • Related system registers
  • Alignment issues
  • Context switching
VECTOR vs SCALAR OPERATION
  • Length / Stride combinations
  • Scalar operations
  • Vector operations
  • Mixed operations
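
The difference between scalar, vector and mixed operations can be sketched with a simplified C model of a single-precision add. This is an illustration under stated assumptions, not the course's own material: it assumes the usual short-vector rules of the VFP architecture (the 32 S registers form four banks of eight, a destination in bank 0 makes the operation scalar, a second source operand in bank 0 makes it mixed, and element registers wrap around inside their bank). The names vfp_model, vfp_next and vfp_add are hypothetical, D registers are not modeled, and overlapping vectors are simply processed element by element.

```c
#include <stddef.h>

enum { VFP_SREGS = 32, VFP_BANK = 8 };

/* Simplified software model of the single-precision register file. */
typedef struct {
    float    s[VFP_SREGS]; /* stands for S0..S31                        */
    unsigned len;          /* vector length in elements, 1..8           */
    unsigned stride;       /* register step between elements, 1 or 2    */
} vfp_model;

/* Step to the next vector element, wrapping inside the 8-register bank. */
unsigned vfp_next(unsigned reg, unsigned stride)
{
    unsigned bank_base = reg - (reg % VFP_BANK);
    return bank_base + (reg + stride) % VFP_BANK;
}

/* Model of Sd = Sn + Sm under the current length and stride. */
void vfp_add(vfp_model *m, unsigned sd, unsigned sn, unsigned sm)
{
    /* Destination in bank 0 (S0..S7) or length 1: scalar operation. */
    int scalar_op    = (sd < VFP_BANK) || (m->len == 1);
    /* Second source in bank 0: it stays fixed (mixed operation). */
    int sm_is_scalar = (sm < VFP_BANK);
    unsigned count   = scalar_op ? 1u : m->len;

    for (unsigned i = 0; i < count; i++) {
        m->s[sd] = m->s[sn] + m->s[sm];
        sd = vfp_next(sd, m->stride);
        sn = vfp_next(sn, m->stride);
        if (!sm_is_scalar)
            sm = vfp_next(sm, m->stride);
    }
}
```

With len = 4 and stride = 1, vfp_add(&m, 24, 8, 16) models a vector add of S8..S11 and S16..S19 into S24..S27, vfp_add(&m, 24, 8, 0) adds the scalar S0 to each element of S8..S11, and vfp_add(&m, 1, 8, 16) only writes S1.
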
VFP LOAD / STORE INSTRUCTIONS
  • Addressing modes
  • Floating-point load / store
  • Floating-point load / store multiple
  • Processor acceleration mechanisms: store merging buffers
ARITHMETIC INSTRUCTIONS
  • Add / subtract / absolute value instructions
  • Multiply and multiply-accumulate instructions
  • Divide instruction
  • Square root instruction
  • Compare instructions
  • Integer-to-FP and FP-to-integer conversion instructions
VFP CODING EXAMPLES
  • FIR filter
    • Converting the scalar algorithm into a vector algorithm
    • Finding the VFP instructions to encode the vector algorithm
    • Optimizing the code
  • FFT (DFT)
    • Converting the scalar algorithm into a vector algorithm, using the symmetry properties of the unit circle to process 4 angles concurrently
    • Finding the VFP instructions to encode the vector algorithm
    • Optimizing the code
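
The first step of the FIR example, converting the scalar algorithm into a vector algorithm, might look like the following C sketch. It is a hedged illustration rather than the code used in the course: the function names fir_scalar and fir_vec4 are hypothetical, the coefficients are assumed to be stored in the order in which they are applied (no time reversal), and the input buffer is assumed to hold n_out + taps - 1 samples. The vector version computes four outputs at a time and multiplies a four-element slice of the input by one scalar coefficient, which corresponds to a mixed (vector by scalar) multiply-accumulate.

```c
#include <stddef.h>

/* Scalar reference: y[n] = sum over k of h[k] * x[n + k]. */
void fir_scalar(const float *x, const float *h, float *y,
                size_t n_out, size_t taps)
{
    for (size_t n = 0; n < n_out; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps; k++)
            acc += h[k] * x[n + k];
        y[n] = acc;
    }
}

/* Vectorized form: four outputs are accumulated concurrently. */
void fir_vec4(const float *x, const float *h, float *y,
              size_t n_out, size_t taps)
{
    size_t n = 0;

    for (; n + 4 <= n_out; n += 4) {
        float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
        for (size_t k = 0; k < taps; k++) {
            const float coeff = h[k];          /* scalar operand */
            /* One length-4 multiply-accumulate: acc += coeff * x[n+k .. n+k+3] */
            for (size_t lane = 0; lane < 4; lane++)
                acc[lane] += coeff * x[n + k + lane];
        }
        for (size_t lane = 0; lane < 4; lane++)
            y[n + lane] = acc[lane];
    }

    /* Remaining outputs when n_out is not a multiple of 4. */
    for (; n < n_out; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps; k++)
            acc += h[k] * x[n + k];
        y[n] = acc;
    }
}
```

Each lane accumulates its products in the same order as the scalar loop, so both functions are expected to produce the same outputs for the same inputs, which makes the scalar version a convenient reference when the inner lane loop is later replaced by VFP vector instructions.
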
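
One possible reading of "processing 4 angles concurrently" in the FFT (DFT) example is the quarter-turn symmetry of the unit circle: for a transform length n divisible by 4, the twiddle factors of bins k, k + n/4, k + n/2 and k + 3n/4 differ only by powers of -j, so a single complex multiplication can feed four output bins. The C sketch below illustrates this on a direct DFT. It is an assumption about the technique, not the ACSYS implementation: the names cplx, rot_minus_j and dft_four_bins are hypothetical, and the twiddle table tw[i] = exp(-2*pi*j*i/n) is assumed to be supplied by the caller.

```c
#include <stddef.h>

typedef struct { float re, im; } cplx;

/* Multiply z by (-j)^m using only swaps and sign changes. */
cplx rot_minus_j(cplx z, unsigned m)
{
    cplx r;
    switch (m & 3u) {
    case 0:  r = z;                          break; /* z * 1    */
    case 1:  r.re =  z.im; r.im = -z.re;     break; /* z * (-j) */
    case 2:  r.re = -z.re; r.im = -z.im;     break; /* z * (-1) */
    default: r.re = -z.im; r.im =  z.re;     break; /* z * (+j) */
    }
    return r;
}

/* Direct DFT of length n (n divisible by 4).
 * tw[i] must hold exp(-2*pi*j*i/n) for i = 0..n-1.
 * Bins k, k + n/4, k + n/2 and k + 3n/4 share each product x[i] * tw[k*i]. */
void dft_four_bins(const cplx *x, cplx *out, const cplx *tw, size_t n)
{
    size_t quarter = n / 4;

    for (size_t k = 0; k < quarter; k++) {
        cplx acc[4] = { {0.0f, 0.0f}, {0.0f, 0.0f}, {0.0f, 0.0f}, {0.0f, 0.0f} };

        for (size_t i = 0; i < n; i++) {
            cplx w = tw[(k * i) % n];
            /* One complex multiplication, shared by the four bins. */
            cplx t;
            t.re = x[i].re * w.re - x[i].im * w.im;
            t.im = x[i].re * w.im + x[i].im * w.re;

            for (unsigned q = 0; q < 4; q++) {
                /* Twiddle of bin k + q*n/4 is tw[k*i] * (-j)^(q*i). */
                cplx r = rot_minus_j(t, (unsigned)(q * i));
                acc[q].re += r.re;
                acc[q].im += r.im;
            }
        }

        for (unsigned q = 0; q < 4; q++)
            out[k + q * quarter] = acc[q];
    }
}
```

The inner loop over q works on four independent accumulators, which is the kind of structure that maps naturally onto a length-4 vector operation. A real FFT would apply the same symmetry inside a fast (for example radix-4) decomposition rather than a direct DFT.
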