This course describes the architecture of the Cortex-A5 and Cortex-A9 and provides coding guidelines.
Objectives
This course explains the low-level characteristics of the Cortex-A9 that are required to develop efficient kernel or application code.
MMU operation under Linux is described.
Spin-lock implementation in a multicore system is also detailed.
Interaction between level 1 caches, level 2 cache and main memory is studied through sequences.
The exception mechanism is explained, indicating how virtualization enables the support of several operating systems.
An overview of the CoreSight specification is provided before the debug-related units are described.
The operation of the Snoop Control Unit when supporting SMP is fully explained, particularly the utilization of cache tag mirrors, the advantage of connecting DMA channels to ACP and the sequences that have to be used to modify a page descriptor.
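As a rough illustration of the spin-lock topic named in the objectives, the sketch below builds a minimal lock from portable C11 atomics. It is not taken from the course material: the names `demo_spinlock_t`, `demo_spin_trylock`, `demo_spin_lock` and `demo_spin_unlock` are hypothetical. On ARMv7-A cores such as the Cortex-A9, compilers typically lower these atomics to exclusive load/store loops (LDREX/STREX) with DMB barriers, which is the mechanism the course studies.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin-lock type, for illustration only. */
typedef struct {
    atomic_flag locked;
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; returns true on success. */
static inline bool demo_spin_trylock(demo_spinlock_t *l)
{
    /* test_and_set returns the previous value: false means the lock was free.
     * Acquire ordering keeps the critical section after the lock is taken. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock is taken. */
static inline void demo_spin_lock(demo_spinlock_t *l)
{
    while (!demo_spin_trylock(l)) {
        /* Plain spinning. A core-specific version might wait for an
         * event (WFE) here instead of polling; that is an assumption,
         * not something this portable sketch does. */
    }
}

/* Release the lock; release ordering publishes the critical section's writes. */
static inline void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller would declare `demo_spinlock_t lock = DEMO_SPINLOCK_INIT;` and bracket the shared-data access with `demo_spin_lock(&lock)` and `demo_spin_unlock(&lock)`.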
Course material in PDF format (in English), plus a printed copy for in-person sessions
Course delivered via the Teams video-conferencing system (when remote)
The trainer answers trainees' questions live during the training and provides technical and pedagogical assistance
At the start of each half-day, time is set aside for interaction with the trainees to make sure the course meets their expectations and to adapt it if necessary
Any embedded systems engineer or technician who has the prerequisites above.
The prerequisites indicated above are assessed before the training by the trainee's technical management within their company, or by the trainee themselves in the exceptional case of an individual trainee.
Trainees' progress is assessed through quizzes offered at the end of sections to check that the trainees have assimilated the points presented
At the end of the training, an attendance record and a certificate attesting that the trainee has successfully completed the course are issued.
If a problem caused by a trainee's lack of prerequisites is identified during the training, a different or complementary training is offered, generally to strengthen those prerequisites, in agreement with the trainee's manager in their company where applicable.
Course outline
Block diagram, 1 or 2 AXI master interfaces
Cortex-A9 variants: single core vs multicore
New memory-mapped registers in MPCore
Configurable options: cache size, Jazelle, NEON, FPU, PTM and IEM
States and modes
Benefit of register banking
Exception mechanism
Purpose of CP15
Superscalar pipeline operation
Branch prediction mechanism
Guidelines for optimal performance
Return stack
TrustZone conceptual view
Permitted secure to non-secure transitions
Memory partitioning
Interrupt management when there is a mix of secure and non-secure interrupt sources
Boot sequence
Inter-Processor Interrupts
Barriers
Cluster ID
Exclusive access monitor
Spin-lock implementation
Using events
Data processing instructions
Branch and control flow instructions
Memory access instructions
Exception generating instructions
If…then conditional blocks
Stack in operation
Accessing special registers
Interworking ARM and Thumb states
Thumb-2EE extension for supporting interpreted languages
Using handlers to manage NULL pointers and array indexes that fall outside a programmable range
MMU objectives
Page sizes
Page access permission, domain and page protection
Page attributes, memory types
Utilization of memory barrier instructions
Format of the external page descriptor table
Tablewalk
Abort exception, on-demand page mechanism
MMU maintenance operations
Using a common page descriptor table in an SMP platform, maintaining coherency of multiple TLBs
Cache organization
Supported maintenance operations
Write and allocate policies
Data prefetching
4-entry 64-bit merging store buffer
Understanding through sequences how cacheable information is copied from memory to level 1 and level 2 caches
Transient operations, use of the line buffers (LFBs, LRBs, EBs and STBs)
Discarding a level 3 memory line load through merging writes into STBs
Cache event monitoring
Describing each maintenance operation
Cache lockdown, implementation of a small memory by a boot program
Interrupt management
Snooping basics
Cache-to-cache transfers
MOESI state machine
Address filtering
Understanding through sequences how data coherency is maintained between L2 memory and L1 caches
Accelerator Coherency Port
Event counting
Debugging a multi-core system with the assistance of the PMU
Cortex-A9 exception management
Interrupt groups: STI, PPI, SPI, LSPI
Assigning a security level to each interrupt source (secure or non-secure)
Prioritization of the interrupt sources
Distribution of the interrupts to the Cortex-A9 cores
Detailing the interrupt sequence
Benefits of CoreSight
Invasive debug, non-invasive debug, taking into account the secure attribute
APBv3 debug interface
Connection to the Debug Access Port
Debug facilities offered by Cortex-A9
Process related breakpoint and watchpoint
Program counter sampling
Event catching
PTM interface, connection to funnel
Cross-Trigger Interface, debugging a multi-core SoC
Placing code, data, stack and heap in the memory map, scatterloading
Reset and initialization
Placing a minimal vector table
Further memory map considerations, 8-byte stack alignment in handlers
Building and debugging an image
Long branch veneers
ARM compiler optimizations, tail-call optimization, inlining of functions
Mixing C/C++ and assembly
Coding with ARM compiler
Unaligned accesses
Local and global data issues, alignment of structures
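As an informal illustration of the "alignment of structures" topic that closes the outline, the sketch below compares two hypothetical structure layouts holding the same members. The names `struct loose` and `struct tight` are invented for this example and do not come from the course. Exact sizes depend on the ABI; under the ARM procedure call standard the first layout typically occupies 12 bytes and the second 8, because padding is inserted to keep the 32-bit member naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only. */
struct loose {          /* declared small, large, small */
    uint8_t  flag;
    uint32_t count;     /* padding is usually inserted before this member */
    uint8_t  mode;      /* trailing padding usually follows this member */
};

struct tight {          /* same members, largest first */
    uint32_t count;
    uint8_t  flag;
    uint8_t  mode;
};

/* The 32-bit member always sits at a multiple of its alignment. */
_Static_assert(offsetof(struct loose, count) % _Alignof(uint32_t) == 0,
               "count must be naturally aligned");

/* Ordering members from largest to smallest does not grow the structure. */
_Static_assert(sizeof(struct tight) <= sizeof(struct loose),
               "reordered layout should not be larger");
```

Printing `sizeof(struct loose)` and `sizeof(struct tight)` on the target toolchain shows how much padding a given declaration order costs.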