0:09
welcome to another video of controllers tech
0:13
this is the third video in this series
0:17
and today we will see the cache policies
0:20
let's start with the cache itself cache
0:23
is a technique of storing a copy of data
0:26
in a very fast memory
0:28
very fast means both read and write
0:31
can take place at very high speed
0:34
cortex m7 processors use the level one
0:38
cache to increase the performance
0:41
there are two types of cache used in cortex m7
0:44
and they are the instruction cache and the data cache
0:48
let's see some of the terms that we are
0:50
going to use in this video
0:53
first one is cache hit and cache miss
0:57
if we are performing the write operation
1:01
if the address to be written is found in the cache it's termed as a cache hit
1:05
and if the address is not present in the
1:07
cache it's termed as cache miss
1:10
in case of the read operation the cache
1:12
hit occurs if the requested data is found in the cache
1:16
and if the data is not present in the
1:18
cache it's called cache miss
1:21
another term that we are going to use is the dirty bit
1:25
the dirty bit indicates whether the cache block is modified
1:29
or not. every cache block
1:32
contains one dirty bit
1:34
generally whenever the master writes the data to the cache
1:38
it sets the dirty bit to indicate the modification
1:42
the data write from cache to memory is done only
1:45
if the dirty bit is set. this way the
1:48
writes to the memory are reduced
1:51
now let's talk about the cache policies
1:53
in cortex m7 processors
1:56
here we have four cache policies write through
1:59
write back write allocate and read allocate
2:03
let's start with the write through policy
2:08
in the write through policy the data is
2:10
simultaneously updated in the cache
2:12
and in the memory. take a look at the diagram
2:17
we will keep the focus on the write operation
2:20
if there is a cache hit the data will be written to the cache
2:24
and then to the memory and if there is a cache miss
2:28
then the data is directly written to the memory
2:31
basically no matter what the data is
2:34
written to the memory
2:36
in a single instruction. this method is
2:39
useful to handle the data coherency
2:42
but here we are performing more writes
2:44
to the memory and that defeats the entire purpose of the
2:47
cache. next is the write-back policy
2:52
here if there is cache hit the master
2:54
will write the data to the cache
2:56
and set the dirty bit. the data can be
2:59
updated later in the memory
3:02
to be precise the data will be written to the memory
3:05
only when the new data is about to be written to that cache block
3:09
but if there is cache miss that means if
3:11
the address to be written is not present in the cache
3:14
then it completely depends on the write allocate policy
3:18
in write allocate the data is loaded
3:20
from memory into cache
3:22
and then it's updated in the cache and the dirty bit is set
3:26
so that the next cycle can get a cache
3:29
hit and the memory can be updated with the new data later
3:33
this is exactly how it's described in
3:35
the diagram on the right
3:37
in case of the cache hit the data is updated in the cache
3:41
and the dirty bit is updated. in case of the cache miss
3:45
it will first locate some cache block
3:48
which can be used for this purpose
3:51
if the block is already dirty the
3:53
previous data will be written to its memory location
3:56
and then it proceeds further with
3:58
copying data from memory to this
4:00
cache block. but if the block is not dirty
4:03
it will copy the data from the memory to the cache block
4:07
then the data is updated in the cache
4:10
and the dirty bit is set
4:11
so that the next cycle will update the memory
4:15
in stm32 the write-back policy is mostly
4:19
used with write allocate
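to make the write-back with write allocate flow easier to follow, here is a small conceptual sketch in C. this is not code from the video or from any real driver, just a hypothetical model of one cache block and the decisions described above

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u   /* cortex m7 uses 32-byte cache lines */

/* hypothetical model of a single cache block */
typedef struct {
    bool     valid;
    bool     dirty;                /* the dirty bit described above */
    uint32_t tag;                  /* which memory block is cached here */
    uint8_t  data[LINE_SIZE];
} cache_line_t;

/* write-back with write allocate, conceptual only */
void cache_write(cache_line_t *line, uint32_t addr, uint8_t value, uint8_t *memory)
{
    uint32_t tag = addr / LINE_SIZE;

    if (line->valid && line->tag == tag) {
        /* cache hit: write only the cache and set the dirty bit,
           the memory is updated later (write-back) */
        line->data[addr % LINE_SIZE] = value;
        line->dirty = true;
        return;
    }

    /* cache miss: write allocate */
    if (line->valid && line->dirty) {
        /* block is already dirty: write the previous data back
           to its memory location first */
        memcpy(&memory[line->tag * LINE_SIZE], line->data, LINE_SIZE);
    }

    /* copy the new block from memory into the cache ... */
    memcpy(line->data, &memory[tag * LINE_SIZE], LINE_SIZE);
    line->valid = true;
    line->tag   = tag;

    /* ... then update it in the cache and set the dirty bit,
       so the next access is a hit and memory is updated later */
    line->data[addr % LINE_SIZE] = value;
    line->dirty = true;
}
```

a write-through policy would instead write to the memory on every single write, which is exactly the extra memory traffic mentioned earlier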
4:22
next is the read allocate. every cacheable region in cortex
4:27
m7 is read allocate by default. in case of the cache miss
4:31
the cache lines are allocated for that address
4:34
so the next access to cache will be a hit
4:38
here is the picture from st's document
4:40
about the policies used in stm
4:42
32 cortex m7 we have write through with no write allocate
4:49
in case of the hit the data will be written to the cache
4:53
and the memory but in case of miss
4:56
the data will be updated in the memory
4:58
and that memory block will not be copied to the cache
5:01
since this is no write allocate then
5:04
there is write back with no
5:06
write allocate in case of the hit
5:09
the cache will be written and the dirty bit will be set
5:13
the main memory is not updated instantly
5:16
that's how the write back works
5:19
in case of miss the data will be updated in the memory
5:22
and that memory block will not be copied to the cache
5:25
since this is no write allocate the last
5:29
one is write back with write and read allocate
5:32
in case of the hit the cache will be
5:35
written and the dirty bit will be set
5:38
the main memory can be updated later that's
5:40
how the write back works
5:43
in case of the miss a cache line is allocated
5:46
and the block is copied into the cache
5:49
this is because now we have both
5:51
read and write allocate. here is the table of the
5:54
cache policies for the memory locations
5:58
remember that every cacheable region is
6:00
read allocate by default
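for reference, on cortex m7 these policies come from the MPU region attributes, the TEX, C and B bits. here is a rough sketch of setting a region to write-back with read and write allocate using the stm32 HAL names. the region number, base address and size are placeholders for illustration, not values from the video

```c
/* attribute combinations for normal memory (ARMv7-M):
 *   TEX=0, C=1, B=0  -> write-through, no write allocate
 *   TEX=0, C=1, B=1  -> write-back, no write allocate
 *   TEX=1, C=1, B=1  -> write-back, read and write allocate
 *   TEX=1, C=0, B=0  -> normal, non-cacheable
 */
#include "stm32f7xx_hal.h"   /* placeholder: use the header for your device */

void mpu_sram_write_back_allocate(void)
{
    MPU_Region_InitTypeDef r = {0};

    HAL_MPU_Disable();

    r.Enable           = MPU_REGION_ENABLE;
    r.Number           = MPU_REGION_NUMBER0;        /* placeholder region number */
    r.BaseAddress      = 0x20020000;                /* placeholder SRAM address */
    r.Size             = MPU_REGION_SIZE_16KB;      /* placeholder size */
    r.AccessPermission = MPU_REGION_FULL_ACCESS;
    r.TypeExtField     = MPU_TEX_LEVEL1;            /* TEX=1 */
    r.IsCacheable      = MPU_ACCESS_CACHEABLE;      /* C=1 */
    r.IsBufferable     = MPU_ACCESS_BUFFERABLE;     /* B=1 -> write-back, r/w allocate */
    r.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;
    r.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;

    HAL_MPU_ConfigRegion(&r);
    HAL_MPU_Enable(MPU_PRIVILEGED_DEFAULT);
}
```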
6:02
out of these locations we are most interested
6:06
in the sram region if you remember the
6:09
data coherency we talked about
6:11
where the cpu and dma are not coherent
6:13
in the cacheable region
6:15
this kind of issue takes place in the
6:17
write-back policy regions and that is why
6:22
we will see some cases of data coherency issues
6:25
and also how to solve these issues
6:28
i made this pdf by collecting some
6:30
important things from one of the documents
6:34
the link to the original document is at
6:37
the bottom of this pdf
6:39
let's start with the first issue when
6:42
the dma writes the data into the sram
6:45
here dma is copying data from the
6:47
peripheral into the sram
6:49
and cpu is trying to copy this data from
6:52
sram to some other location
6:55
also note that the cache policy of sram is write back
6:58
with read and write allocate. dma
7:01
reads data from peripheral and updates
7:04
the data into the receive buffer in sram
7:07
but since we are using data cache cpu
7:10
will read the data from the cache
7:12
which hasn't been updated so we have the
7:15
data coherency issue here
7:17
to solve this we can invalidate the cache
7:24
dma will read the data from the peripheral
7:27
and write it in the receive buffer as
7:30
soon as dma is finished
7:32
we will invalidate the cache for the
7:34
receive buffer region
7:41
now when the cpu tries to access the cache the data won't be there
7:45
so there will be a cache miss
7:53
since the sram has the read allocate policy
7:56
in case of the cache miss
7:58
a cache line will be allocated in the
8:00
cache memory and the receive buffer will be copied into it
8:04
now when the cpu tries to access this buffer
8:07
it will have the same data as the sram
8:11
this is how the coherency issue can be solved
8:14
using the cache invalidate
8:18
here is the sample code but this is as
8:21
per the microchip's framework
8:24
but we will just focus on what's
8:26
happening instead of the functions they use
8:29
this handler is called when the dma
8:32
has finished the transfer
8:34
we have the transfer complete callback for the dma
8:38
inside the handler we have to invalidate the cache for the receive buffer
8:43
the address is the address of the
8:45
receive buffer and the size is the data length
8:48
or the buffer size. this function is
8:51
exactly the same across all cortex
8:54
m7 devices. in the main function
8:57
we can wait for the transfer to finish
8:59
and then copy the data using the cpu
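the sample code in the video follows the microchip framework, but the cache maintenance call itself is the CMSIS function SCB_InvalidateDCache_by_Addr, which is common to all cortex m7 devices. here is a minimal sketch of the flow just described; the buffer name, size, callback name and the dma start step are assumptions for illustration, not the video's code

```c
#include "stm32f7xx_hal.h"   /* placeholder device header; SCB_* calls come from CMSIS */

#define RX_SIZE 64u          /* multiple of the 32-byte cache line size */

/* receive buffer aligned to a 32-byte cache line so the invalidate
   does not touch neighbouring data */
static uint8_t rx_buf[RX_SIZE] __attribute__((aligned(32)));

static volatile uint8_t rx_done = 0;

/* hypothetical "DMA receive complete" callback, mirroring the handler in the video */
void dma_rx_complete_callback(void)
{
    /* DMA has just written the data into SRAM behind the cache's back,
       so throw away the stale cache lines covering the receive buffer */
    SCB_InvalidateDCache_by_Addr((uint32_t *)rx_buf, RX_SIZE);
    rx_done = 1;
}

int main(void)
{
    /* ... start the peripheral-to-memory DMA into rx_buf here ... */

    while (!rx_done) { }      /* wait for the transfer to finish */

    /* the next CPU access misses in the cache, the line is re-allocated
       from SRAM (read allocate), so the CPU now sees the DMA data */
    uint8_t copy[RX_SIZE];
    for (uint32_t i = 0; i < RX_SIZE; i++) {
        copy[i] = rx_buf[i];
    }

    (void)copy;
    for (;;) { }
}
```

note that the address and size passed to SCB_InvalidateDCache_by_Addr should be multiples of the 32-byte cache line, otherwise data sharing a line with the buffer can be affected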
9:03
the next issue is when dma reads the data from the sram
9:08
here the cpu updates the data in the cache
9:11
but due to write back policy the memory
9:14
is not updated until the next write
9:17
now when the dma reads the sram it will
9:19
always read the old data and not the new one
9:24
i hope you remember when i mentioned
9:27
that the data to the memory is written
9:29
when the new data is about to be written to that cache block
9:33
due to this there is always coherency
9:35
issue between the cpu write
9:37
and dma read. we can solve this by cleaning the cache
9:43
cpu writes the data into the cache
9:46
cache clean operation is performed to copy the data
9:50
into the sram. now the dma reads the data
9:54
which will be coherent with the cpu
9:57
this is the sample program. first the
10:00
data is being copied into the transmit buffer
10:04
then we will perform the cache clean by address
10:07
and finally enable the dma transfer
10:11
this way the dma can read the latest
10:14
data from the transmit buffer
10:15
and the coherency issue can be solved
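here is a minimal sketch of this clean flow using the CMSIS function SCB_CleanDCache_by_Addr. the buffer name, size and the dma start step are assumptions for illustration, not the exact code from the video

```c
#include "stm32f7xx_hal.h"   /* placeholder device header; SCB_* calls come from CMSIS */
#include <string.h>

#define TX_SIZE 64u          /* multiple of the 32-byte cache line size */

/* transmit buffer aligned to a 32-byte cache line */
static uint8_t tx_buf[TX_SIZE] __attribute__((aligned(32)));

void send_with_dma(const uint8_t *data, uint32_t len)
{
    if (len > TX_SIZE) {
        len = TX_SIZE;
    }

    /* 1. CPU copies the data into the transmit buffer; with write-back
          this lands in the cache while SRAM may still hold old data */
    memcpy(tx_buf, data, len);

    /* 2. clean the cache by address: force the dirty lines out to SRAM
          so the DMA will read the latest data */
    SCB_CleanDCache_by_Addr((uint32_t *)tx_buf, TX_SIZE);

    /* 3. ... enable the memory-to-peripheral DMA transfer from tx_buf here ... */
}
```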
10:19
so we saw we can use cache invalidate
10:22
and cache clean to solve the data coherency issues
10:26
we can also just make the region
10:27
non-cacheable through the mpu as sketched below
10:30
and it will work just fine. i will show
10:33
this entire coherency issue in the next video
10:36
where we will see some practical usage
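until then, here is roughly what the mpu alternative can look like with the stm32 HAL. the region number, base address and size are placeholders and must match where your own dma buffers are placed

```c
#include "stm32f7xx_hal.h"   /* placeholder: use the header for your device */

/* mark a block of SRAM holding the DMA buffers as normal, non-cacheable
   (TEX=1, C=0, B=0), so no invalidate/clean calls are needed for it */
void mpu_dma_region_non_cacheable(void)
{
    MPU_Region_InitTypeDef r = {0};

    HAL_MPU_Disable();

    r.Enable           = MPU_REGION_ENABLE;
    r.Number           = MPU_REGION_NUMBER1;        /* placeholder region number */
    r.BaseAddress      = 0x20010000;                /* placeholder buffer address */
    r.Size             = MPU_REGION_SIZE_16KB;      /* placeholder size */
    r.AccessPermission = MPU_REGION_FULL_ACCESS;
    r.TypeExtField     = MPU_TEX_LEVEL1;            /* TEX=1 */
    r.IsCacheable      = MPU_ACCESS_NOT_CACHEABLE;  /* C=0 */
    r.IsBufferable     = MPU_ACCESS_NOT_BUFFERABLE; /* B=0 */
    r.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;
    r.DisableExec      = MPU_INSTRUCTION_ACCESS_DISABLE;
    r.SubRegionDisable = 0x00;

    HAL_MPU_ConfigRegion(&r);
    HAL_MPU_Enable(MPU_PRIVILEGED_DEFAULT);
}
```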
10:40
here is the link to the source document
10:43
you can read it for more detailed information
10:47
this is it for this video. you can
10:50
download the files from the link in the description
10:53
keep watching and have a nice day ahead