1
00:00:04,060 --> 00:00:10,883
Okay, so, now that we've gone through the
beginning exercises of what a directory

2
00:00:10,883 --> 00:00:16,906
based distributed shared memory machine
looks like. Let's talk about how to

3
00:00:16,906 --> 00:00:24,204
actually figure out where the directory
is. So you have an address. And usually

4
00:00:24,204 --> 00:00:28,055
in these systems you want to do it on
the physical address space. You're not

5
00:00:28,055 --> 00:00:31,568
going to want to do this on virtual
addresses. You don't want to have to run

6
00:00:31,568 --> 00:00:35,852
this. This is because you're sharing data
between lots of different systems. At this

7
00:00:35,852 --> 00:00:39,462
point you're sort of out on the system
bus. Your address is no longer virtual.

8
00:00:39,462 --> 00:00:43,217
You've gone through the
translation lookaside buffer or the MMU

9
00:00:43,217 --> 00:00:51,360
and you've figured out what the physical
address is. So, to figure out what the

10
00:00:51,620 --> 00:00:56,828
directory is, or what is sometimes called,
in a distributed memory machine, the

11
00:00:56,828 --> 00:01:04,260
home node, you need the number of
which one of these directories to go to.

12
00:01:07,840 --> 00:01:12,440
And there's a lot of different ways to do
this. But one of the more common ones is

13
00:01:12,440 --> 00:01:19,160
to just use some bits out of the address.
So you take the number of directories in

14
00:01:19,160 --> 00:01:25,086
the system. Take the log base two of that.
And then, you take that number of bits to

15
00:01:25,086 --> 00:01:30,205
be the home node number. So when you take
a cache miss, and it's not in your cache,

16
00:01:30,205 --> 00:01:35,450
and you need to go figure out and do the
load of that data, we'll say, you send a

17
00:01:35,450 --> 00:01:40,316
message and the message ID and the
destination of that message will actually

18
00:01:40,316 --> 00:01:45,182
be the home node. And hopefully, your
interconnect knows how to route the data

19
00:01:45,182 --> 00:01:52,604
to that directory. Now, taking the high
order bits has some benefits. Let's

20
00:01:52,604 --> 00:01:57,411
take a look at that. As we discussed
already, in a non-uniform memory

21
00:01:57,411 --> 00:02:03,089
access architecture, the OS can control
the placement. It can do this because, based

22
00:02:03,089 --> 00:02:08,691
on these high order bits, you can actually
determine which node in the system

23
00:02:08,691 --> 00:02:14,023
or which directory in the system you're
going to. So you can actually basically

24
00:02:14,023 --> 00:02:19,220
allocate memory, allocate your stack,
allocate your instruction space, based on

25
00:02:19,520 --> 00:02:24,627
the physical address and the OS commands
that. Cuz the OS has absolute authority

26
00:02:24,627 --> 00:02:31,742
over where physical addresses get doled
out to. The downside is a directory or a home

27
00:02:31,742 --> 00:02:39,029
node can become a hot spot. So let's say
all of a sudden, all of the processors in

28
00:02:39,029 --> 00:02:46,326
your system try to access one page of
memory. There's, like, a hot page which

29
00:02:46,326 --> 00:02:51,314
has all the locks in the system. And,
you're in some threaded program and you

30
00:02:51,314 --> 00:02:57,289
have to access those locks a lot. Well, if
you look at that, that's all going to be

31
00:02:57,289 --> 00:03:02,164
down here. It's going to be sort of low
order addresses. It might be from sort of

32
00:03:02,164 --> 00:03:07,230
here down, whatever your page size is, we'll
say. So even, even if you're not having

33
00:03:07,230 --> 00:03:13,384
false sharing, anything like that. You
typically would try to pack all the data

34
00:03:13,384 --> 00:03:17,733
onto a page or a structure or something
like that, and it's pretty hard to

35
00:03:17,733 --> 00:03:22,200
interleave it based on the very high order
bits of your address. And

36
00:03:22,200 --> 00:03:26,784
especially considering a program has
effectively no control of the high order

37
00:03:26,784 --> 00:03:33,072
bits of a physical address, that's managed
by the OS. So if you do this, one node can

38
00:03:33,072 --> 00:03:37,640
become a hot spot, because these are all
aliased to the same directory. So all

39
00:03:37,640 --> 00:03:42,381
the messaging traffic goes to one node and
this almost starts to turn back into a

40
00:03:42,381 --> 00:03:46,544
bus. Now, we have one directory, all
traffic has to go there. It's a little

41
00:03:46,544 --> 00:03:51,054
better 'cause we don't necessarily need to
invalidate all other locations, but the

42
00:03:51,054 --> 00:03:57,168
directory and the bandwidth in the
directory starts to become critical. Hm,

43
00:03:57,168 --> 00:04:04,590
well that's a tough one. The flip side is
you can start to try to have the low order

44
00:04:04,590 --> 00:04:11,001
bits determine where your directory is, or
which home node you're using. So, you

45
00:04:11,001 --> 00:04:17,418
still have the offset within a cache
line. But then you let the

46
00:04:17,418 --> 00:04:21,554
bits of the physical address that
determine which home node you're going to be

47
00:04:21,554 --> 00:04:26,768
the low-order bits. Well, this ends up
being very well load balanced, because

48
00:04:26,768 --> 00:04:31,616
you choose different home nodes
effectively at random, depending on which

49
00:04:31,616 --> 00:04:37,475
cache line it is. So, you know, the
same cache line will go to the

50
00:04:37,475 --> 00:04:43,266
same home node or the same directory. But
if you have several different cache lines,

51
00:04:43,266 --> 00:04:48,989
which is pretty common, because it is
pretty hard to contend all on one cache

52
00:04:48,989 --> 00:04:55,325
line, there isn't much data in one cache
line, you'll spread across the different

53
00:04:55,325 --> 00:05:01,243
controllers and you'll effectively have
some good distribution. The flip side, though, is the

54
00:05:01,550 --> 00:05:08,675
OS loses placement ability here. So
it's a tricky trade-off here

55
00:05:08,675 --> 00:05:13,635
to think about. Some people have even
built systems where it's configurable.

56
00:05:13,826 --> 00:05:19,040
This gets a little more advanced. And I
touched on this in the last slide of

57
00:05:19,040 --> 00:05:24,508
today's lecture. But, you could think
about having some systems where, depending

58
00:05:24,508 --> 00:05:29,340
on the actual address and depending on what
comes out of your page table, you maybe

59
00:05:29,340 --> 00:05:34,031
make different choices of how to do the
mapping. But everyone has to agree on the

60
00:05:34,031 --> 00:05:37,557
mapping. Which gets a little bit tricky
cuz the directory has to agree on the

61
00:05:37,557 --> 00:05:40,900
mapping. And all of the caches in the
system have to agree on the mapping.
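
To make the two mappings concrete, here is a minimal sketch, assuming a power-of-two number of directories, 32-bit physical addresses, and 64-byte cache lines. The function names and those parameters are made up for illustration; they are not from the lecture or from any particular machine.

```python
def home_node_high_bits(phys_addr, num_dirs, addr_bits):
    """Home node taken from the top log2(num_dirs) bits of the physical address."""
    node_bits = num_dirs.bit_length() - 1      # log2, assuming a power of two
    return phys_addr >> (addr_bits - node_bits)


def home_node_low_bits(phys_addr, num_dirs, line_bytes):
    """Home node taken from the bits just above the cache-line offset."""
    offset_bits = line_bytes.bit_length() - 1  # log2, assuming a power of two
    return (phys_addr >> offset_bits) & (num_dirs - 1)


# One hypothetical 4 KB "hot page": 64 cache lines of 64 bytes each.
page_base = 0x40000000
lines = [page_base + i * 64 for i in range(64)]

# High-order bits: every line in the page goes to the same directory.
high_homes = {home_node_high_bits(a, 4, 32) for a in lines}

# Low-order bits: consecutive lines rotate across all four directories.
low_homes = {home_node_low_bits(a, 4, 64) for a in lines}
```

With the high-order mapping the OS picks the home node by picking the physical page, but the whole page lands on one directory. With the low-order mapping the page is spread across all the directories, and the OS no longer controls placement.
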

62
00:05:42,100 --> 00:05:45,914
Okay, so let's take a look at what is
inside of a directory. So we added this

63
00:05:45,914 --> 00:05:49,829
new hardware structure, and whenever we
add a new hardware structure I like to

64
00:05:49,829 --> 00:05:54,938
look at all the bits inside of the
hardware structure. So we add a new hardware

65
00:05:54,938 --> 00:06:00,315
structure, and this hardware structure has an
entry per cache line in that

66
00:06:00,315 --> 00:06:05,355
particular memory connected to the
directory. So if you were to look across

67
00:06:05,355 --> 00:06:10,799
the entire system, there will actually be
an extra piece of data for every single

68
00:06:10,799 --> 00:06:15,882
cache line in the system. And the naive
approach to this will have it such that

69
00:06:15,882 --> 00:06:20,724
every single cache line in the system,
whether it's... Sorry, not every single

70
00:06:20,724 --> 00:06:25,959
cache line, every single memory line in
the system. So if you have ten terabytes of

71
00:06:25,959 --> 00:06:31,129
memory in the system, the naive approach
is going to have a directory entry for

72
00:06:31,129 --> 00:06:37,738
every single block-sized chunk of memory, a
cache-block-sized chunk of memory in the

73
00:06:37,738 --> 00:06:43,444
system. And these are held in big tables,
typically they're held in SRAM. You might

74
00:06:43,444 --> 00:06:49,545
try to put them in DRAM. And what do we
have here? Well, the directory needs to know

75
00:06:49,545 --> 00:06:55,473
what state the cache line is in and we're
going to look at three different states in

76
00:06:55,473 --> 00:07:01,772
our basic protocol here: shared, uncached,
and exclusive. So everything starts out as

77
00:07:01,772 --> 00:07:09,909
uncached; it's out in main memory. When it
gets pulled into a cache as read-only, the

78
00:07:09,909 --> 00:07:18,689
directory is going to note that it is now
shared. If it gets pulled into a cache

79
00:07:18,689 --> 00:07:26,097
read/write, the directory is going to note
that as exclusive. Now, if it's in shared

80
00:07:26,097 --> 00:07:32,929
or exclusive, we need to know what node...
well, if it's exclusive, we need to know

81
00:07:32,929 --> 00:07:37,472
uniquely what node has that, so we can go
message it when we need to go

82
00:07:37,472 --> 00:07:42,313
invalidate it. And if it's shared, we need
to know the list of all possible places

83
00:07:42,313 --> 00:07:47,084
that it could be, that we're going to have
to send messages to. And this is better

84
00:07:47,084 --> 00:07:51,541
than having to broadcast or send messages
to all the nodes in the system. So we're

85
00:07:51,541 --> 00:07:57,167
going to have what's called a sharer list
here, which, in a naive full-map

86
00:07:57,167 --> 00:08:03,427
directory, is going to have one bit per
core in the system, or per cache in the

87
00:08:03,427 --> 00:08:09,927
system. And it's either just going to have
a one or a zero in it. So if it's a one, that

88
00:08:09,927 --> 00:08:16,427
means that core has a shared or read-only
copy of the data. And when some other

89
00:08:16,427 --> 00:08:23,248
cache goes to get it writable in its
cache, it's going to have to invalidate,

90
00:08:23,248 --> 00:08:34,313
let's say, this one's or the zeroth core's
cache. Now if you're exclusive, you're not

91
00:08:34,313 --> 00:08:39,104
going to have multiple bits set here, 'cause
this basically means that, that core has a

92
00:08:39,104 --> 00:08:43,837
writable copy, and if we want
to keep the data coherent, we don't want

93
00:08:43,837 --> 00:08:48,690
multiple writable copies
in the system. So as you can see

94
00:08:48,690 --> 00:08:53,486
here, there's only one one denoted here. And if
it's uncached, we don't need to track

95
00:08:53,486 --> 00:09:00,804
anything there; we've just got don't-cares.
There's one other state here that I

96
00:09:00,804 --> 00:09:06,441
have and it's pending. And this usually
actually turns into a couple of sub-states;

97
00:09:06,626 --> 00:09:12,038
there's different ways to track this. At
the directory, these transactions take

98
00:09:12,038 --> 00:09:17,040
multiple steps. You're going to send some
data and start transitioning. Let's say,

99
00:09:17,040 --> 00:09:21,327
you want to get the data writable.
Well, the directory's going to

100
00:09:21,717 --> 00:09:26,914
have to invalidate all the other copies.
It can't do this instantaneously, but we

101
00:09:26,914 --> 00:09:32,175
want to provide the appearance of
atomicity, or that the operations

102
00:09:32,175 --> 00:09:37,415
are atomic in some way. So typically,
you'll actually have some sub-states that

103
00:09:37,415 --> 00:09:42,728
are stored here, which
are something like, oh, this cache line is

104
00:09:42,728 --> 00:09:47,631
currently transitioning from, I don't
know, U to E. Don't allow some other

105
00:09:47,631 --> 00:09:53,310
transaction to happen to it right now.
Just kind of block that. One way to do

106
00:09:53,310 --> 00:09:58,810
that is to store it actually
in the directory, as a state bit. Another

107
00:09:58,810 --> 00:10:03,853
way is you have some fully associative
structure, a side structure, which just

108
00:10:03,853 --> 00:10:09,601
has all of the cache lines currently in
flux. And, the directory's smart enough to

109
00:10:09,601 --> 00:10:15,208
know that if some other request comes in
for that line, while it's in flux just to

110
00:10:15,414 --> 00:10:21,089
NACK that request, or negative acknowledge
that request and tell the other cache to

111
00:10:21,089 --> 00:10:26,001
retry. So you can do it either way, but it
gets pretty complicated. We're not going

112
00:10:26,001 --> 00:10:30,005
to talk about all the details of that but
we'll talk about the high level

113
00:10:30,168 --> 00:10:37,864
transitions assuming that they are somehow
atomic. So here we're going to look at how

114
00:10:37,864 --> 00:10:44,916
MSI fits together with this. But you
could actually think about doing this with

115
00:10:44,916 --> 00:10:50,301
MESI or some other protocol. MSI is a little
bit simpler, MSI is a little bit

116
00:10:50,301 --> 00:10:55,756
simpler so we're going to look at that.
Also, the benefit of something like a MESI

117
00:10:55,756 --> 00:11:01,071
protocol is lessened in a directory
because if you pull something in, in the

118
00:11:01,071 --> 00:11:07,117
exclusive state, which is unmodified at
the beginning. And someone else wants to

119
00:11:07,117 --> 00:11:11,458
get a read only copy. You're basically
going to have to send a message to that

120
00:11:11,458 --> 00:11:16,147
core. And that was inexpensive on a bus,
because it could just see the transaction

121
00:11:16,147 --> 00:11:20,777
going across. And it would just snoop it
and would demote from E to shared or

122
00:11:20,777 --> 00:11:25,060
something like that, E to S. But now, it
actually turns into actual work. The

123
00:11:25,060 --> 00:11:29,344
directory's going to have to generate
messages. And you're going to have to wait

124
00:11:29,344 --> 00:11:34,565
for responses coming back from a cache
which had it in exclusive. So full MESI is

125
00:11:34,565 --> 00:11:39,644
a little bit less common when you start to
grow these distributed shared memory

126
00:11:39,837 --> 00:11:48,786
protocols. Okay, so this is a slide we had
before. This is MSI on a bus. Well things

127
00:11:48,786 --> 00:11:53,699
change a little bit when we go to MSI for
directory coherence. And before we go

128
00:11:53,699 --> 00:11:58,549
through this, I wanted to point out, that
there are actually two different state

129
00:11:58,549 --> 00:12:05,060
machines going on here. There's one state
machine that is happening in the cache

130
00:12:05,060 --> 00:12:10,466
controllers, so actually, in the cache of
a respective processor. And then there's a

131
00:12:10,466 --> 00:12:16,139
different state machine which is happening
in the directory. And you'll see that they

132
00:12:16,139 --> 00:12:21,345
have different letters here. This is S, U,
and E versus M, S, and I. And we label

133
00:12:21,345 --> 00:12:26,818
these differently on purpose, just to not
get totally confused. And these state

134
00:12:26,818 --> 00:12:32,024
machines interact by sending messages
between each other, and as messages flow

135
00:12:32,024 --> 00:12:38,559
between the directory and the cache, they
will both be going through different state

136
00:12:38,559 --> 00:12:47,216
transitions on these two tables.
Okay, so let's, let's jump into this. This

137
00:12:47,216 --> 00:12:53,022
is the same modified, shared and invalid
states that we had in our bus-based

138
00:12:53,022 --> 00:12:58,677
snoopy MSI protocol. We didn't
change anything here. And the rules

139
00:12:58,677 --> 00:13:04,859
are the same. If you have it modified,
you can do a write to this and not have to send

140
00:13:04,859 --> 00:13:11,516
any messages. If you have it shared, you
can read the data and not have to contact

141
00:13:11,516 --> 00:13:16,192
anybody. If you have it invalid, and you
want to do anything with it, you probably

142
00:13:16,192 --> 00:13:21,048
need to contact somebody. Or you probably
need to contact the directory. Before, we

143
00:13:21,048 --> 00:13:26,144
would have to send the transaction on the
bus. Likewise, the transition from S to M

144
00:13:26,144 --> 00:13:31,060
or M to S, where you used to communicate,
is the same. So think about this as

145
00:13:31,060 --> 00:13:36,599
the same state machine running, except
not running on a bus. Where before we would

146
00:13:36,599 --> 00:13:40,789
send transactions across the bus, now
we're going to take those transactions and

147
00:13:40,789 --> 00:13:45,304
turn them into messages that we send to
the directory and messages that we receive

148
00:13:45,304 --> 00:13:49,548
from the directory that we have to respond
to. So before, we were snooping

149
00:13:49,548 --> 00:13:54,009
traffic across the bus, which caused us to
transition between different states. So here,

150
00:13:54,009 --> 00:13:58,144
another processor has intent to write and we
saw that across the bus. So we had to

151
00:13:58,144 --> 00:14:02,914
transition ourselves to the invalid state.
Now, we're actually going to get a message

152
00:14:02,914 --> 00:14:07,818
from the directory controller. So let's,
let's walk through this. But it's, it's

153
00:14:07,818 --> 00:14:12,850
almost exactly the same as what we saw
before. So this is the cache state for

154
00:14:12,850 --> 00:14:23,210
a particular line for processor P1. We'll
start with the entry points. We start off

155
00:14:23,210 --> 00:14:29,990
in invalid, and let's say we want to get a
readable copy of this line. So

156
00:14:29,990 --> 00:14:35,648
we're going to take a read miss. So what
we're going to do is processor one is

157
00:14:35,648 --> 00:14:40,633
actually going to send a read miss message
to the directory controller. And

158
00:14:40,633 --> 00:14:45,493
during that time, it does not have a
readable copy. It cannot go and access the

159
00:14:45,493 --> 00:14:51,274
data. It's effectively still
in the I state. Sometimes people will

160
00:14:51,274 --> 00:14:55,451
actually have sort of a pending state here,
depending on how you go to implement this. It

161
00:14:55,598 --> 00:14:59,726
depends if you have a side structure,
something like a miss handling register,

162
00:14:59,726 --> 00:15:03,560
where you'll track that. Or you can
track that in the cache data itself.
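
Here is a minimal sketch of that cache-side behavior, assuming the pending miss is tracked in a side structure rather than in the cache state itself. The class name, the `mshr` set, the `outbox` list, and the `"ReadMiss"` message tag are all hypothetical names chosen for illustration.

```python
class CacheController:
    """Toy cache-side view of one processor; line states are "M", "S", or "I"."""

    def __init__(self):
        self.state = {}    # line address -> state; a missing entry means "I"
        self.mshr = set()  # lines with a read miss outstanding (side structure)
        self.outbox = []   # messages on their way to the directory controller

    def load(self, line):
        # A readable copy (S or M) can be used without contacting anybody.
        if self.state.get(line, "I") in ("S", "M"):
            return "hit"
        # Otherwise the line is still effectively in I: send one read miss
        # message and remember that the request is pending.
        if line not in self.mshr:
            self.mshr.add(line)
            self.outbox.append(("ReadMiss", line))
        return "stall"


cache = CacheController()
first = cache.load(0x40)   # miss: a read miss message goes to the directory
second = cache.load(0x40)  # still pending: no second message is sent
```

The point of the side structure is that the line itself never leaves the I state until the response arrives; the pending bookkeeping only prevents duplicate requests.
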

163
00:15:05,026 --> 00:15:11,530
So you're going to read miss. You send the
read miss message, and you're waiting for

164
00:15:11,530 --> 00:15:18,033
a response. This response is going to have
the data that you need. And, it's going to

165
00:15:18,033 --> 00:15:24,698
be a synchronization point saying, okay,
you're safe to transition to S. Okay that

166
00:15:24,698 --> 00:15:29,550
seems pretty simple. Similar sort of thing
here for a write miss: if we're in the

167
00:15:29,550 --> 00:15:34,522
invalid state and we do a write we're
going to send a write miss request to the

168
00:15:34,522 --> 00:15:39,187
directory controller. It's going to do
something, and we may have to be

169
00:15:39,187 --> 00:15:43,419
waiting for a while here, 'cause it may have
to go invalidate all of the other lines in

170
00:15:43,419 --> 00:15:47,899
the system. And then we get a response,
and once we get a response we have the data

171
00:15:47,899 --> 00:15:52,842
and we can transition to the modified
state. So as we said, these arcs are

172
00:15:52,842 --> 00:15:59,161
pretty easy: you can read by P1 and nothing
changes, or we can read or write in the M

173
00:15:59,161 --> 00:16:05,405
state by P1 and we also don't communicate with
anybody. But now we have a few different

174
00:16:05,405 --> 00:16:12,273
messages coming in here. If we're in the
shared state, we have to be responsive to

175
00:16:12,273 --> 00:16:19,446
an invalidation message. Which is a little
bit different than a bus snoop. So before,

176
00:16:19,446 --> 00:16:25,398
we saw another processor trying to write.
That's when we would transition to I, but now

177
00:16:25,398 --> 00:16:30,623
the directory controller sends us a
message which says, invalidate this line

178
00:16:30,623 --> 00:16:35,734
and that will transition us to I here.
Note, there will probably be a reply. We

179
00:16:35,734 --> 00:16:40,357
will probably have to send a reply,
because the directory controller wants to

180
00:16:40,357 --> 00:16:45,184
know when all of the cache lines in the
system have been invalidated, and it may take

181
00:16:45,184 --> 00:16:49,382
a variable amount of time, and it's sending
messages so it wants to wait for a reply

182
00:16:49,382 --> 00:16:55,306
to come back so we're going to have to
send a reply. So this arc here is similar.

183
00:16:55,306 --> 00:17:00,783
Except we need to write back data, 'cause
we had modified data. We had writable

184
00:17:00,783 --> 00:17:06,260
data. We get an invalidate message from
the directory controller. So we need to

185
00:17:06,260 --> 00:17:11,527
write back the data, and then reply
afterwards. Similar, similar sort of idea

186
00:17:11,527 --> 00:17:18,116
here. Okay, so that leaves two arcs left
here in the middle. We're in shared, and

187
00:17:18,116 --> 00:17:23,611
we want to do a write to that cache
line. So, our cache has it in the shared

188
00:17:23,611 --> 00:17:30,629
state. We want to do a write to it. Before
we can actually do a write we have to send

189
00:17:30,629 --> 00:17:36,388
a message to the directory saying, I'm
doing a write miss here. I want to get

190
00:17:36,388 --> 00:17:43,200
this data writable. And we have to wait
for a reply before we transition here.

191
00:17:44,151 --> 00:17:49,103
That's because we have to wait for the directory
controller to communicate with all the other

192
00:17:49,103 --> 00:17:55,732
caches so that they don't have read-only
copies and we can have a writable copy. So

193
00:17:55,732 --> 00:18:00,337
it's going to invalidate all the other
readable copies in the meantime. And then

194
00:18:00,337 --> 00:18:05,842
finally, we have an edge coming this way
which is from modified down to shared. And

195
00:18:05,842 --> 00:18:11,288
this is a little bit different. Well, it's
the same idea here. Another processor is

196
00:18:11,288 --> 00:18:17,075
trying to do a read. So we have it in a
modified state when another processor

197
00:18:17,075 --> 00:18:22,378
tries to do a read. So we receive a read
miss message. We don't need to invalidate

198
00:18:22,378 --> 00:18:27,354
the data, but we need to write back the
data. Cuz we have the most up to date

199
00:18:27,354 --> 00:18:32,265
copy, cuz we had it modified. So we're
going to go and write back the data, and

200
00:18:32,265 --> 00:18:36,848
that's going to be our response, and then
we're going to transition to the shared

201
00:18:36,848 --> 00:18:42,360
state. We can keep a read copy of this,
because the other core

202
00:18:42,360 --> 00:18:49,484
is only getting a readable copy of it
also. Okay, so that's that. Any questions

203
00:18:49,484 --> 00:18:58,984
about that so far? Okay, so two
interesting arcs that we're going to add

204
00:18:58,984 --> 00:19:12,245
in here are this one and this one, which we
didn't have in our base MSI protocol. And

205
00:19:12,245 --> 00:19:20,051
you know, you may not need these. But what
these correspond to is, if our cache has

206
00:19:20,051 --> 00:19:27,858
the data in it and then, because of, let's
say, a conflict miss or capacity miss, it

207
00:19:27,858 --> 00:19:35,101
gets bumped out. It might be a good idea
to go update the directory, and tell the

208
00:19:35,101 --> 00:19:40,218
directory that in the future if some other
cache wants to go get that data, that it

209
00:19:40,218 --> 00:19:46,408
doesn't need to go contact you again. So,
if it's in the modified state we can write

210
00:19:46,408 --> 00:19:51,289
back the data, because we have dirty data;
we write that back to the directory, and then

211
00:19:51,289 --> 00:19:55,410
we notify the directory saying we don't
have a copy of this anymore you can

212
00:19:55,410 --> 00:20:00,837
transition to having it uncached. Likewise
here, if we have a read-only copy we may

213
00:20:00,837 --> 00:20:04,866
or may not want to do this. If we, if
there's, you know, extra bandwidth on the,

214
00:20:04,866 --> 00:20:09,318
on the interconnect we might want to send
a message when we do an invalidation here.

215
00:20:09,318 --> 00:20:13,665
And this is not an invalidation because of
an invalidation message, but this is an

216
00:20:13,665 --> 00:20:18,422
invalidation, because it just gets bumped
out of the cache. We may want to notify

217
00:20:18,422 --> 00:20:24,511
the directory saying please remove us from
the sharer list. And if the sharer list is

218
00:20:24,511 --> 00:20:29,907
then empty, the directory might
change the cache line from shared to being

219
00:20:29,907 --> 00:20:35,966
uncached completely. But I do want to
point out that these are not strictly

220
00:20:35,966 --> 00:20:41,359
necessary. The reason they're not strictly
necessary is, if we build the cache

221
00:20:41,359 --> 00:20:45,703
controller system such that if you're in
the invalid state for a particular cache

222
00:20:45,703 --> 00:20:49,779
line, and you get some message coming in
that would have been let's say this

223
00:20:49,779 --> 00:20:54,070
message, or that message, or some other
arc. We can just reply back saying yeah,

224
00:20:54,070 --> 00:20:58,521
we don't have it anymore. We're invalid,
you know, we don't really care about that,

225
00:20:58,521 --> 00:21:03,833
that transition. So if you were, you were
here, the only message that's going to

226
00:21:03,833 --> 00:21:08,003
come really to you is an invalidation
message that would just take you to this

227
00:21:08,003 --> 00:21:12,015
state anyway. So, we can just ignore the
message or just reply the same as we

228
00:21:12,015 --> 00:21:20,951
would to the normal invalidation message.
Okay, so the directory state transitions look

229
00:21:20,951 --> 00:21:33,545
a little different here. We have uncached,
shared, and exclusive. As we said, shared

230
00:21:33,545 --> 00:21:38,637
means there can be multiple read-only
copies in the system. Exclusive means

231
00:21:38,637 --> 00:21:44,205
there's only one cache in the system with
that data. What's interesting here is if

232
00:21:44,205 --> 00:21:49,962
you were to actually have a MESI protocol
running, that would not change the

233
00:21:49,962 --> 00:21:56,264
protocol running in the directory. Because
exclusive here is effectively the

234
00:21:56,264 --> 00:22:02,365
same state, with respect to how the
directory sees the line. You won't have to

235
00:22:02,365 --> 00:22:07,328
do anything different. Okay, so let's walk
through a few transitions here of the state

236
00:22:07,328 --> 00:22:13,162
of the cache line in the directory and
this is not in the cache. Let's start off

237
00:22:13,162 --> 00:22:18,193
uncached, and let's say we're getting a
message which is a read miss from

238
00:22:18,193 --> 00:22:24,704
processor P. Well, we should transition to
S now. We should give it a readable copy

239
00:22:24,704 --> 00:22:30,123
and we should reply with the actual data
and we should put P on the sharer list, so

240
00:22:30,123 --> 00:22:35,607
that we know that if someone else needs to
go invalidate that line we need to go

241
00:22:35,607 --> 00:22:40,577
contact P. Now that we're in the shared
state, let's say there's other read misses

242
00:22:40,577 --> 00:22:44,879
from other P's, other processors here. Well,
we're going to give up the data, and

243
00:22:44,879 --> 00:22:49,942
we're going to add it to the sharer list,
so we take sharers and add P to it. The

244
00:22:49,942 --> 00:22:55,773
sharer list is just going to
grow. Okay, let's start here and go

245
00:22:55,773 --> 00:23:02,042
the other way, where we're uncached and all of
a sudden we get a write miss from processor

246
00:23:02,042 --> 00:23:07,720
P. Well, we give it the data, and the sharer
list, or the owner, is going to get P

247
00:23:07,720 --> 00:23:13,841
uniquely put onto it. We're going to give it
in the exclusive state, because we were uncached

248
00:23:13,841 --> 00:23:23,006
before. We don't need to contact anybody
else. Let's look at this arc here before

249
00:23:23,006 --> 00:23:28,902
we go to these. So this is a little bit
different. Quite a bit different than what

250
00:23:28,902 --> 00:23:34,944
we had in these slides, because it's doing
something different. But in this state

251
00:23:34,944 --> 00:23:45,318
here, we know, let's say, processor P zero
has the data exclusively. But all of a

252
00:23:45,318 --> 00:23:50,278
sudden, a different processor, let's say
processor two, goes to access the data.

253
00:23:50,282 --> 00:23:55,627
Well, we already have the data in the
exclusive state. So we're going to stay in

254
00:23:55,627 --> 00:23:59,655
this exclusive state, cuz some other cache is
going to want to get it exclusive, but

255
00:23:59,655 --> 00:24:05,315
it's a different cache. So what has to
happen here is we need to go invalidate

256
00:24:05,315 --> 00:24:10,835
the data out of P zero. P zero is going to
write back the data, it's going to

257
00:24:10,835 --> 00:24:19,581
transition to the invalid state. Then we
need to provide the data to the new

258
00:24:19,581 --> 00:24:27,814
processor, P2 we'll say, and add P2 to
the sharer list. So we can

259
00:24:27,814 --> 00:24:33,219
transition to this state and then finally
let's look at the edges between these two

260
00:24:33,219 --> 00:24:40,531
points. Oh, actually, let's go this way
first: if you have data that gets written

261
00:24:40,531 --> 00:24:46,631
back. So this is that arc, which I said is
similar to the arc here, which is

262
00:24:46,631 --> 00:24:53,659
optional. Let's say you have data that
gets written back here. Actually,

263
00:24:53,659 --> 00:24:58,570
this, this arc may not be optional, let's
think about that for a second. This arc

264
00:24:58,570 --> 00:25:04,932
may not be optional. No, it's still
optional, cuz you can just NACK the

265
00:25:04,932 --> 00:25:10,273
message effectively, and tell it it's
in main memory. Okay, so you're here, and

266
00:25:10,273 --> 00:25:15,000
you see a data write back happening. So,
a message gets sent to you which is the

267
00:25:15,000 --> 00:25:19,849
equivalent of this arc here. The data was
writeable, was exclusive to some cache,

268
00:25:19,849 --> 00:25:24,331
and it's no longer writeable. It's
probably a good idea to go contact the

269
00:25:24,331 --> 00:25:29,303
directory, write back the data, and clear
the sharer list. The sharer list is empty,

270
00:25:29,303 --> 00:25:36,925
so it knows that no one has a copy of it,
at that point. Okay, a few other transitions

271
00:25:36,925 --> 00:25:44,361
here. Okay, we are in the shared state. So
we have multiple read-only copies. And one

272
00:25:44,361 --> 00:25:50,546
cache comes along and says, "Oh, I need to
send a write miss message." I need to get a

273
00:25:50,546 --> 00:25:56,227
writable copy. Well, now we actually have to
go through a pretty long process. We're

274
00:25:56,227 --> 00:26:00,372
going to walk through the entire sharer
list and send messages to all the sharers

275
00:26:00,372 --> 00:26:05,253
in the sharer list saying, invalidate this
copy and tell me when you're done. We're

276
00:26:05,253 --> 00:26:09,355
going to collect all the responses at the
directory. And once all the responses have

277
00:26:09,355 --> 00:26:15,231
come back, we know no one else has
a readable copy. We can give the data value

278
00:26:15,231 --> 00:26:26,648
to the requester. And add it to the sharer
or owner list. Okay, the last arc here is from

279
00:26:26,648 --> 00:26:32,666
E to S, this orange arc, and that happens
if we have a particular line as writable

280
00:26:32,666 --> 00:26:37,918
in one cache, and another cache wants to go
read it now. It will send a read miss, and the

281
00:26:37,918 --> 00:26:44,218
other cache is going to downgrade from E
to S, excuse me, from M to S in its local

282
00:26:44,218 --> 00:26:49,527
cache. But the directory is going to
transition from E to S here and we have to

283
00:26:49,527 --> 00:26:55,401
go get the most up-to-date data from the node.
So, we're going to send a fetch, a

284
00:26:55,401 --> 00:27:00,993
fetch request to the node that had it
before as exclusive. Once you get the

285
00:27:00,993 --> 00:27:06,656
most up-to-date data, you can forward
that to the new reader, and

286
00:27:06,656 --> 00:27:13,796
we add their processor to the sharer
list. Okay, so questions about that one so

287
00:27:13,796 --> 00:27:18,555
far? These, these do start to get a little
complicated because you have multiple

288
00:27:18,555 --> 00:27:27,586
state machines interacting. Okay, so we're
going to speed up a little bit here. I

289
00:27:27,586 --> 00:27:32,267
include this chart from your book just to
give you an example. We went through,

290
00:27:32,267 --> 00:27:36,716
very quickly here, all the different
messages. And, this chart here sums up all

291
00:27:36,716 --> 00:27:41,454
the different message types, and who
they could come from and who they could go

292
00:27:41,454 --> 00:27:46,423
to. And this is, this is in your textbook.
And sometimes messages need to communicate

293
00:27:46,423 --> 00:27:50,641
addresses. Sometimes they need to
communicate data. Sometimes they need to

294
00:27:50,641 --> 00:27:55,091
communicate which node the message is
coming from, to add it to the sharer

295
00:27:55,091 --> 00:27:59,616
list. But I'm not going to go through this
in great detail. One thing I did

296
00:27:59,616 --> 00:28:06,179
want to say is, these message types here,
do not include replies. So, when you go to

297
00:28:06,179 --> 00:28:16,888
request something, there are replies that
come back. These replies, that's not

298
00:28:16,888 --> 00:28:23,887
drawn in this diagram. We, we see data
value reply but that's not, that's just

299
00:28:23,887 --> 00:28:28,994
the actual data. There's not, like, a
response coming back from the sharer

300
00:28:28,994 --> 00:28:33,583
acking the invalidate,
or something like that.

301
00:28:33,583 --> 00:28:38,560
Another type of message that is pretty
common, that is not drawn here is a

302
00:28:38,560 --> 00:28:43,731
negative acknowledgement. So it's pretty
common if you have a cache line that is

303
00:28:43,731 --> 00:28:48,256
being transitioned, it's in a pending
state at the directory, and you get a

304
00:28:48,256 --> 00:28:55,123
request coming in. You might need to tell
that cache to retry later: I can't handle this

305
00:28:55,123 --> 00:28:56,180
case right now.
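
The directory transitions walked through above can be summarized in a minimal Python sketch. This is an illustration under stated assumptions, not the lecture's or the textbook's implementation: the names State, DirectoryEntry, handle, and the message strings (including "fetch_invalidate" and "nack") are hypothetical, each request is handled in one atomic step instead of waiting for invalidate acknowledgements, and the pending flag is set by hand to stand in for a line that is mid-transition.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    UNCACHED = "U"
    SHARED = "S"
    EXCLUSIVE = "E"


@dataclass
class DirectoryEntry:
    state: State = State.UNCACHED
    sharers: set = field(default_factory=set)  # node IDs holding a copy
    pending: bool = False  # assumed flag: line is mid-transition


def handle(entry, msg, node):
    """Apply one request to a directory entry.

    Returns the list of (message, destination) pairs the directory
    would send. Message names are illustrative assumptions.
    """
    if entry.pending:
        # Line is in a pending state: tell the requester to retry later.
        return [("nack", node)]

    out = []
    if msg == "read_miss":
        if entry.state is State.EXCLUSIVE:
            # Fetch the up-to-date data; the owner downgrades to shared.
            owner = next(iter(entry.sharers))
            out.append(("fetch", owner))
        out.append(("data_value_reply", node))
        entry.sharers.add(node)
        entry.state = State.SHARED
    elif msg == "write_miss":
        if entry.state is State.SHARED:
            # Walk the sharer list and invalidate every other copy.
            for sharer in sorted(entry.sharers - {node}):
                out.append(("invalidate", sharer))
        elif entry.state is State.EXCLUSIVE:
            # Old owner writes back and invalidates; state stays exclusive.
            owner = next(iter(entry.sharers))
            out.append(("fetch_invalidate", owner))
        out.append(("data_value_reply", node))
        entry.sharers = {node}
        entry.state = State.EXCLUSIVE
    elif msg == "data_write_back":
        # Owner gives the line up: clear the sharer list.
        entry.sharers.clear()
        entry.state = State.UNCACHED
    else:
        raise ValueError(f"unknown message: {msg}")
    return out
```

A real directory would hold the line in a pending state while it collects the invalidate responses, which is exactly the window in which the negative acknowledgement described above is needed.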