1
00:00:03,932 --> 00:00:11,804
Okay. So that finishes what I was talking
about. And now we get to move on to a

2
00:00:11,804 --> 00:00:17,860
fun subject: how to go about
implementing shared memory in a small

3
00:00:17,860 --> 00:00:25,993
symmetric multiprocessor. So to recap, the
multiprocessor has processors and memory.

4
00:00:25,993 --> 00:00:32,481
And all the processors are equidistant
from the memory. Hence the name,

5
00:00:32,481 --> 00:00:40,762
symmetric. You also have I/O. The
things here are disks, and graphics, and

6
00:00:40,762 --> 00:00:48,464
networks. And these also end up living on
your processor bus. Now, let's take

7
00:00:48,464 --> 00:00:54,854
a look at one of these processor buses.
And see, roughly, what one of these

8
00:00:54,854 --> 00:01:00,917
things looks like. So here, we have two
processors and some memory. It's all

9
00:01:00,917 --> 00:01:07,307
hanging off of a bus. And when I say a
bus, I actually mean a multi-drop bus. So

10
00:01:07,307 --> 00:01:13,321
any of these entities can drive different
wires on this bus. Now, of course, you

11
00:01:13,321 --> 00:01:18,803
need to be careful here to not have
multiple entities driving the wires at the

12
00:01:18,803 --> 00:01:24,354
same time. And hence, you need some sort
of arbitration to prevent that. You could

13
00:01:24,354 --> 00:01:29,835
have a pull-down style bus, where
multiple people can be trying to drive the

14
00:01:29,835 --> 00:01:36,497
bus at the same time, and nothing will catch
on fire, because there's basically some

15
00:01:36,497 --> 00:01:42,325
sort of resistor which pulls the bus up to
a certain value. And then, when an entity

16
00:01:42,325 --> 00:01:47,401
wants to not assert anything onto the bus,
it just floats its output. Or when it

17
00:01:47,401 --> 00:01:52,250
wants to drive something on the bus, it'll
actually pull it down to a low value. So

18
00:01:52,250 --> 00:01:56,923
that'll be a zero when pulled down, and if nothing
drives the wire, the resistor just pulls it

19
00:01:56,923 --> 00:02:01,071
up to a one. But of course you can
have multi-drop buses. And I

20
00:02:01,071 --> 00:02:05,628
wanted to sort of talk a little bit about
arbitration for a bus. Because even if you

21
00:02:05,628 --> 00:02:10,886
have a multi-drop bus, you still have
multiple people screaming on this bus at

22
00:02:10,886 --> 00:02:15,676
the same time. It's a shared medium. It's
like if everyone in this room were trying

23
00:02:15,676 --> 00:02:21,298
to scream at the same time, trying to
communicate and figure out what's going on.

24
00:02:21,298 --> 00:02:27,854
And this is in contrast to sort of point
to point systems, which we will be talking

25
00:02:27,854 --> 00:02:34,029
about in a few lectures, where we'll be
talking about other coherent systems that

26
00:02:34,029 --> 00:02:40,051
you can implement over switched networks.
But for right now let's assume that you

27
00:02:40,051 --> 00:02:46,074
just have a set of wires. Everyone can
basically drive these wires. Everyone can

28
00:02:46,302 --> 00:02:52,096
receive from these wires. And usually,
arbitration actually is done as a pull

29
00:02:52,096 --> 00:02:59,658
down style bus. So let's think real
hard about a couple of ways to do this.

30
00:02:59,658 --> 00:03:08,484
How do you do arbitration? People all
want to scream at the same time and say, I

31
00:03:08,484 --> 00:03:16,508
want the bus. Anyone have any thoughts?
So, one way to do this, and this is

32
00:03:16,508 --> 00:03:25,033
actually relatively common for
arbitration, is you have an arbiter. And let's

33
00:03:25,033 --> 00:03:38,923
say there are three request wires.
And then three grant wires. So, this is

34
00:03:38,923 --> 00:03:48,900
a valid way of doing this, and
actually this is not too uncommon. You have

35
00:03:48,900 --> 00:03:59,008
an arbiter chip, which has however many
requesters coming in. Someone asserts the

36
00:03:59,008 --> 00:04:07,258
request wire, and a signal comes back
from the arbiter chip. And, in fact,

37
00:04:07,258 --> 00:04:14,449
in something like PCI, there's
something that's very similar

38
00:04:14,449 --> 00:04:20,780
to this. There's a PCI host controller.
Each device that's plugged into the

39
00:04:20,780 --> 00:04:28,440
multi-drop bus pulls a wire down to request
the bus. And within one cycle, the

40
00:04:28,440 --> 00:04:35,240
arbitrator makes a decision. And if you
have priority inside there, it could .

41
00:04:35,240 --> 00:04:40,713
There are different ways to go about doing
this. Then it will assert one of the grant

42
00:04:40,713 --> 00:04:46,122
wires coming back and that's who wins the
arbitration. So that's one way to go about

43
00:04:46,122 --> 00:04:51,274
doing this. Another way is actually,
let's say you have three different

44
00:04:51,467 --> 00:04:56,618
entities on this bus, to do it in a
more distributed fashion. You could have a

45
00:04:56,618 --> 00:05:01,705
multi-drop bus. And if you have three
entities, you actually have three wires on

46
00:05:01,705 --> 00:05:06,793
your arbitration bus. And when someone
wants to use the bus, they just pull down.

47
00:05:06,793 --> 00:05:12,062
And there are pull-up resistors.
What happens now is everyone can see who's

48
00:05:12,062 --> 00:05:17,182
requesting at the same time, and if you
have a fixed priority, let's say if all

49
00:05:17,182 --> 00:05:22,761
three wires are pulled down, then requester
one always wins, or the lower-numbered one

50
00:05:22,761 --> 00:05:27,946
always wins. You could do something like
that, so you can do it without having a

51
00:05:27,946 --> 00:05:33,394
specific active entity; you can do it in a distributed
fashion. So that's actually pretty common

52
00:05:33,394 --> 00:05:38,645
in some of these clusters, but you
also see this where there is an arbiter

53
00:05:38,645 --> 00:05:43,240
chip, if you want to do fancier
arbitration. Okay, so let's split up our bus.

54
00:05:43,240 --> 00:05:48,306
Starting from control here. So control is going to
be: I'm doing a load, or I'm doing a

55
00:05:48,306 --> 00:05:54,306
store. It's like a request. It's
probably not actually, I'm doing a load

56
00:05:54,306 --> 00:05:59,106
or I'm doing a store. Probably, load
and store are cut into smaller

57
00:05:59,106 --> 00:06:04,973
transactions. As we'll see
soon in our protocols, there might be a

58
00:06:04,973 --> 00:06:10,506
request for a line or something like that.
Or a request to have exclusive access to

59
00:06:11,306 --> 00:06:16,706
data. And that will go across our control.
But it could also just be a load and

60
00:06:16,706 --> 00:06:22,990
store. From the processor to memory, it
is the same. There's addresses, there's

61
00:06:22,990 --> 00:06:29,490
some data. Of course, hopefully you're
probably not going to talk on your bus.

62
00:06:29,490 --> 00:06:35,906
You can do it when you talk but it's
probably not what you want to do. Okay,

63
00:06:35,906 --> 00:06:42,990
so the basic idea of these multi-drop
buses is you can't all scream at the same time, and whatever

64
00:06:42,990 --> 00:06:49,740
you say on the bus everyone else hears.
One of the interesting things is that you

65
00:06:49,740 --> 00:06:55,160
may not want to arbitrate for the
bus, and then own the bus for a long

66
00:06:55,160 --> 00:07:00,230
period of time, and then release the bus.
Instead, you actually might want to

67
00:07:00,230 --> 00:07:05,772
pipeline these operations, such that
you can be arbitrating for the next use of

68
00:07:05,772 --> 00:07:11,180
the bus while the current use of the bus
is still going on. So, to sort

69
00:07:11,180 --> 00:07:16,723
of show this graphically, let's say this is
a transaction here for a load that

70
00:07:16,723 --> 00:07:21,927
Processor one is doing. So Processor one
asserts on the arbitration bus, saying, I

71
00:07:21,927 --> 00:07:28,877
want to use this bus sometime in the near
future. And it wins. Then, let's say it

72
00:07:28,877 --> 00:07:38,579
actually, the bus is designed so that each
subsequent cycle has the next

73
00:07:38,579 --> 00:07:47,585
thing happen. So it's actually these four
cycles. Now, it's probably not this

74
00:07:47,585 --> 00:07:52,078
simple. But I kind of want to get the idea
across here that you can pipeline access

75
00:07:52,078 --> 00:07:56,301
to the bus. And there's good
reason to do this, because, for instance,

76
00:07:56,301 --> 00:08:00,741
if you're trying to do a load from memory,
it takes time for the data to come back.

77
00:08:00,741 --> 00:08:04,964
So the other option is you drive the
address, and you just wait for the memory

78
00:08:04,964 --> 00:08:09,132
to respond with the load data or
something like that. But instead, if you

79
00:08:09,132 --> 00:08:13,572
pipeline it, some other entity can start a
different transaction here. And you have

80
00:08:13,572 --> 00:08:18,120
another transaction happening there on the
address bus. While you're waiting for the

81
00:08:18,120 --> 00:08:21,285
memory to basically look up and get back.
Yeah.

82
00:08:21,286 --> 00:08:28,861
So you overlap in these pipelined buses.
One other thing I wanted to say is that

83
00:08:28,861 --> 00:08:35,403
this is a pretty simple pipelined bus
model. You can go to much more advanced

84
00:08:35,403 --> 00:08:41,514
things that are called split-phase
transaction buses, where what you'll

85
00:08:41,514 --> 00:08:48,486
actually do is you'll basically arbitrate
for a bus, drive a request onto the bus,

86
00:08:48,486 --> 00:08:55,868
and then the outcome of that may take
tens of cycles. You release the bus in the meantime. And

87
00:08:55,868 --> 00:09:01,005
the entity which has to respond then
rearbitrates for the bus, and gives the

88
00:09:01,005 --> 00:09:06,142
response. So a good example of this is one
processor is trying to load from the main

89
00:09:06,142 --> 00:09:12,501
memory, which is very far away. It's going to take
hundreds of cycles to respond. So it arbitrates for the

90
00:09:12,501 --> 00:09:17,088
bus, and drives this load transaction and the
address. And the transaction was designed

91
00:09:17,088 --> 00:09:22,102
such that, the bus engineer designed it
such that data doesn't come back in the

92
00:09:22,102 --> 00:09:27,116
load transaction. Sometime in the future,
the memory controller will arbitrate for

93
00:09:27,116 --> 00:09:33,641
the bus. So instead of having P1
arbitrate, it's the memory controller, and it

94
00:09:33,641 --> 00:09:38,424
will send something like a load response.
And then load response will say maybe it

95
00:09:38,424 --> 00:09:43,726
will drive the address. Maybe it won't. It probably
has to drive something here so that it

96
00:09:43,726 --> 00:09:48,855
knows, so that the system knows which
memory transaction this response belongs

97
00:09:48,855 --> 00:09:53,926
to. And then finally, the data. So to sum up
here you can actually have multiple

98
00:09:53,926 --> 00:09:59,228
transactions on the bus and have what we
think of as one transaction, just a load

99
00:09:59,228 --> 00:10:05,542
from memory actually split into multiple
transactions. They call this a split-phase

100
00:10:05,542 --> 00:10:07,620
transaction bus.