1 00:00:03,932 --> 00:00:11,804 Okay. So that finishes what I was talking about. And now we get to move on to a 2 00:00:11,804 --> 00:00:17,860 fun subject: how to go about implementing shared memory in a small 3 00:00:17,860 --> 00:00:25,993 symmetric multiprocessor. So to recap, the multiprocessor has processors and memory, 4 00:00:25,993 --> 00:00:32,481 and all the processors are equidistant from the memory. Hence the name, 5 00:00:32,481 --> 00:00:40,762 symmetric. You also have I/O. The things here are disks, and graphics, and 6 00:00:40,762 --> 00:00:48,464 networks. And these also end up living on your I/O bus or your processor bus. Now, let's take 7 00:00:48,464 --> 00:00:54,854 a look at one of these processor buses and see, roughly, what one of these 8 00:00:54,854 --> 00:01:00,917 things looks like. So here, we have two processors and some memory, all 9 00:01:00,917 --> 00:01:07,307 hanging off of a bus. And when I say a bus, I actually mean a multi-drop bus. So 10 00:01:07,307 --> 00:01:13,321 any of these entities can drive different wires on this bus. Now, of course, you 11 00:01:13,321 --> 00:01:18,803 need to be careful here to not have multiple entities driving the wires at the 12 00:01:18,803 --> 00:01:24,354 same time, and hence you need some sort of arbitration to prevent that. You could 13 00:01:24,354 --> 00:01:29,835 have a pull-down style bus, where multiple people can be trying to drive the 14 00:01:29,835 --> 00:01:36,497 bus at the same time and nothing will set on fire, because there's basically some 15 00:01:36,497 --> 00:01:42,325 sort of resistor which pulls the bus up to a certain value. And then, when the entity 16 00:01:42,325 --> 00:01:47,401 wants to not assert anything onto the bus, it just floats its output. Or when it 17 00:01:47,401 --> 00:01:52,250 wants to drive something on the bus, it'll actually pull it down to a low value. 
So 18 00:01:52,250 --> 00:01:56,923 that'll be a zero when pulled down, and if it doesn't drive anything, the wire just gets pulled 19 00:01:56,923 --> 00:02:01,071 up to a one. But of course you can have multi-drop buses. And I 20 00:02:01,071 --> 00:02:05,628 wanted to talk a little bit about arbitration for a bus, because even if you 21 00:02:05,628 --> 00:02:10,886 have a multi-drop bus you still have multiple people screaming on this bus at 22 00:02:10,886 --> 00:02:15,676 the same time. It's a shared medium. It's like if everyone in this room were trying 23 00:02:15,676 --> 00:02:21,298 to scream at the same time, trying to communicate and figure out what's going on. 24 00:02:21,298 --> 00:02:27,854 And this is in contrast to point-to-point systems, which we will be talking 25 00:02:27,854 --> 00:02:34,029 about in a few lectures, where we'll be talking about other coherence systems that 26 00:02:34,029 --> 00:02:40,051 you can implement over switched networks. But for right now let's assume that you 27 00:02:40,051 --> 00:02:46,074 just have a set of wires. Everyone can basically drive these wires. Everyone can 28 00:02:46,302 --> 00:02:52,096 receive from these wires. And usually, arbitration actually is done as a 29 00:02:52,096 --> 00:02:59,658 pull-down style bus. So let's think real hard about a couple of ways to do this. How 30 00:02:59,658 --> 00:03:08,484 do you do arbitration? Multiple people want to scream at the same time and say, I 31 00:03:08,484 --> 00:03:16,508 want the bus. Anyone have any thoughts? So, one way to do this, and this is 32 00:03:16,508 --> 00:03:25,033 actually relatively common for arbitration, is you have an arbitrator. And let's 33 00:03:25,033 --> 00:03:38,923 say there are three request wires and then three grant wires. So, this is 34 00:03:38,923 --> 00:03:48,900 a valid way of doing this, and actually this is not too uncommon. 
You have 35 00:03:48,900 --> 00:03:59,008 a chip, which has however many requesters coming in. Someone asserts their 36 00:03:59,008 --> 00:04:07,258 request wire, and a signal comes back from the arbitrator chip. In fact, 37 00:04:07,258 --> 00:04:14,449 in something like PCI, there's something that's very similar 38 00:04:14,449 --> 00:04:20,780 to this. There's a PCI host controller. Each device that's plugged into the 39 00:04:20,780 --> 00:04:28,440 multi-drop bus pulls its wire down to request the bus. And within one cycle, the 40 00:04:28,440 --> 00:04:35,240 arbitrator makes a decision. And you could have priority inside there. 41 00:04:35,240 --> 00:04:40,713 There are different ways to go about doing this. Then it will assert one of the grant 42 00:04:40,713 --> 00:04:46,122 wires coming back, and that's who wins the arbitration. So that's one way to go about 43 00:04:46,122 --> 00:04:51,274 doing this. Another way, let's say you have three different 44 00:04:51,467 --> 00:04:56,618 entities on this bus, is to do it in a more distributed fashion. You could have a 45 00:04:56,618 --> 00:05:01,705 multi-drop bus, and if you have three entities, you actually have three wires on 46 00:05:01,705 --> 00:05:06,793 your arbitration bus. And when someone wants to use the bus, they just pull their wire down, 47 00:05:06,793 --> 00:05:12,062 and there are pull-up resistors. What happens now is everyone can see who's 48 00:05:12,062 --> 00:05:17,182 requesting at the same time, and if you have a fixed priority, let's say if all 49 00:05:17,182 --> 00:05:22,761 three wires are pulled down, then requester one always wins, or the lower-numbered one 50 00:05:22,761 --> 00:05:27,946 always wins. You could do something like that, so you can do it without having a 51 00:05:27,946 --> 00:05:33,394 specific active entity; you can do it in a distributed fashion. 
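The two arbitration schemes just described can be sketched in a few lines of Python. This is only an illustrative model, not how any particular bus implements it: the function names `wire_level`, `central_arbitrator`, and `distributed_winner` are made up for this sketch, and index 0 in the lists stands for "requester one" in the lecture. The fixed-priority rule (lowest-numbered requester wins) is the one mentioned above; a real arbitrator could use a different policy.

```python
from typing import List, Optional


def wire_level(pulling_down: List[bool]) -> int:
    """Level of one pull-up wire: 0 if any attached device pulls it down, else 1."""
    return 0 if any(pulling_down) else 1


def central_arbitrator(requests: List[bool]) -> List[bool]:
    """Central chip: one grant wire per request wire; the lowest index wins."""
    grants = [False] * len(requests)
    for i, requesting in enumerate(requests):
        if requesting:
            grants[i] = True  # assert exactly one grant wire
            break
    return grants


def distributed_winner(requests: List[bool]) -> Optional[int]:
    """No central chip: every device reads all request wires and applies
    the same fixed-priority rule, so they all agree on the winner."""
    # Request wire i is pulled down only by device i in this sketch.
    levels = [wire_level([r]) for r in requests]
    for i, level in enumerate(levels):
        if level == 0:  # a low wire means that device is requesting
            return i
    return None  # nobody is requesting the bus


if __name__ == "__main__":
    print(central_arbitrator([False, True, True]))
    print(distributed_winner([True, True, True]))
```

Because every device evaluates the same rule on the same wires, the distributed version needs no active arbitrating entity, which is the point made in the lecture.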
So that's actually pretty common 52 00:05:33,394 --> 00:05:38,645 in some of these buses, but you also see designs where there is an arbitrator 53 00:05:38,645 --> 00:05:43,240 chip if you want to do fancier arbitration. Okay, so let's split up our bus, 54 00:05:43,240 --> 00:05:48,306 starting from control here. So control is going to be: I'm doing a load, or I'm doing a 55 00:05:48,306 --> 00:05:54,306 store. It's like a request. It's probably not actually "I'm doing a load" 56 00:05:54,306 --> 00:05:59,106 or "I'm doing a store"; load and store are probably cut into smaller 57 00:05:59,106 --> 00:06:04,973 transactions, as we'll see soon. In our protocols, there might be a 58 00:06:04,973 --> 00:06:10,506 request for a line or something like that, or a request to have exclusive access to 59 00:06:11,306 --> 00:06:16,706 data, and that will go across our control. But it could also just be a load or 60 00:06:16,706 --> 00:06:22,990 a store. From the processor to memory, it's the same: there are addresses, there's 61 00:06:22,990 --> 00:06:29,490 some data, and of course you're probably going to want a clock on your bus. 62 00:06:29,490 --> 00:06:35,906 You can do it without a clock, but that's probably not what you want to do. Okay, 63 00:06:35,906 --> 00:06:42,990 so the basic idea of these multi-drop buses is that anyone can scream on them, and whatever 64 00:06:42,990 --> 00:06:49,740 you say on the bus everyone else hears. One of the interesting things is that you 65 00:06:49,740 --> 00:06:55,160 may not want to arbitrate for the bus, and then own the bus for a long 66 00:06:55,160 --> 00:07:00,230 period of time, and then release the bus. Instead, you actually might want to 67 00:07:00,230 --> 00:07:05,772 pipeline these operations, such that you can be arbitrating for the next use of 68 00:07:05,772 --> 00:07:11,180 the bus while the current use of the bus is still going on. 
So, to sort 69 00:07:11,180 --> 00:07:16,723 of show this graphically, let's say this is a transaction here for a load that 70 00:07:16,723 --> 00:07:21,927 Processor one is doing. So Processor one puts a request on the arbitration bus, saying, I 71 00:07:21,927 --> 00:07:28,877 want to use this bus sometime in the near future. And it wins. Then, let's say 72 00:07:28,877 --> 00:07:38,579 the bus is designed so that in each subsequent cycle the next 73 00:07:38,579 --> 00:07:47,585 thing happens. So it actually takes these four cycles. Now, it's probably not this 74 00:07:47,585 --> 00:07:52,078 simple, but I want to get the idea across here that you can pipeline access 75 00:07:52,078 --> 00:07:56,301 to the bus. And there's good reason to do this, because, for instance, 76 00:07:56,301 --> 00:08:00,741 if you're trying to do a load from memory, it takes time for the data to come back. 77 00:08:00,741 --> 00:08:04,964 So the other option is you drive the address and you just wait for the memory 78 00:08:04,964 --> 00:08:09,132 to respond with the load data or something like that. But instead, if you 79 00:08:09,132 --> 00:08:13,572 pipeline it, some other entity can start a different transaction here, and you have 80 00:08:13,572 --> 00:08:18,120 another transaction happening there on the address bus while you're waiting for the 81 00:08:18,120 --> 00:08:21,285 memory to basically look up the data and get back. Yeah. 82 00:08:21,286 --> 00:08:28,861 So you'd overlap things in these pipelined buses. One other thing I wanted to say is that 83 00:08:28,861 --> 00:08:35,403 this is a pretty simple pipelined bus model. You can go to much more advanced 84 00:08:35,403 --> 00:08:41,514 things that are called split-phase transaction buses. 
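To make the overlap concrete, here is a small Python sketch of the simple pipelined bus model. It assumes, for illustration only, that the four cycles correspond to the four groups of wires discussed earlier (arbitration, control, address, data) and that a new transaction can start every cycle; the names `STAGES`, `pipelined_schedule`, and `total_cycles` are invented for this example.

```python
from typing import Dict, List

# Assumed stage order: one cycle per group of bus wires.
STAGES = ["arbitration", "control", "address", "data"]


def pipelined_schedule(transactions: List[str]) -> Dict[int, Dict[str, str]]:
    """Map each cycle to the transaction occupying each stage in that cycle."""
    schedule: Dict[int, Dict[str, str]] = {}
    for start, name in enumerate(transactions):
        # Transaction k wins arbitration in cycle k, then moves one stage per cycle.
        for offset, stage in enumerate(STAGES):
            schedule.setdefault(start + offset, {})[stage] = name
    return schedule


def total_cycles(n: int, pipelined: bool) -> int:
    """Cycles needed for n transactions with and without overlap."""
    if n == 0:
        return 0
    if pipelined:
        return n + len(STAGES) - 1  # fill the pipeline once, then one per cycle
    return n * len(STAGES)  # each transaction owns the bus start to finish


if __name__ == "__main__":
    sched = pipelined_schedule(["P1 load", "P2 load", "P1 store"])
    for cycle in sorted(sched):
        print(cycle, sched[cycle])
    print(total_cycles(3, pipelined=True), total_cycles(3, pipelined=False))
```

In cycle 1 of this schedule, P2 is arbitrating while P1 is already driving control, which is the overlap described above.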
There, what you'll 85 00:08:41,514 --> 00:08:48,486 actually do is arbitrate for the bus, drive a request onto the bus, 86 00:08:48,486 --> 00:08:55,868 and then the outcome of that may take tens of cycles, so you release the bus in the meantime. And 87 00:08:55,868 --> 00:09:01,005 the entity which has to respond then rearbitrates for the bus and gives the 88 00:09:01,005 --> 00:09:06,142 response. So a good example of this is one processor trying to load from main 89 00:09:06,142 --> 00:09:12,501 memory, which is very far away. It's going to take hundreds of cycles to respond. So it puts on the 90 00:09:12,501 --> 00:09:17,088 bus this load transaction and the address. And the transaction was designed, 91 00:09:17,088 --> 00:09:22,102 the bus engineer designed it, such that data doesn't come back in the 92 00:09:22,102 --> 00:09:27,116 load transaction. Sometime in the future, the memory controller will arbitrate for 93 00:09:27,116 --> 00:09:33,641 the bus. So instead of P1 arbitrating, it's the memory controller, and it 94 00:09:33,641 --> 00:09:38,424 will send something like a load response. And then the load response, maybe it 95 00:09:38,424 --> 00:09:43,726 will drive the address, maybe it won't. It probably has to drive something here so that 96 00:09:43,726 --> 00:09:48,855 the system knows which response this is when there are multiple memory transactions 97 00:09:48,855 --> 00:09:53,926 happening. And then finally it drives the data. So, to sum up here, you can actually have multiple 98 00:09:53,926 --> 00:09:59,228 transactions on the bus and have what we think of as one transaction, just a load 99 00:09:59,228 --> 00:10:05,542 from memory, actually split into multiple transactions. They call that a split-phase 100 00:10:05,542 --> 00:10:07,620 transaction bus.
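A split-phase transaction can be sketched as two separate bus transactions linked by an identifier. The Python model below is a rough illustration under several assumptions that are not from the lecture: the bus carries one transaction per cycle, a ready response always wins arbitration over a new request, memory has a fixed latency, and the "something" that identifies a response is a small integer `tag`. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class LoadRequest:
    tag: int          # assumed identifier used to match the later response
    requester: str
    address: int


class MemoryController:
    def __init__(self, memory: Dict[int, int], latency: int) -> None:
        self.memory = memory
        self.latency = latency
        self.pending: List[Tuple[int, LoadRequest]] = []  # (ready_cycle, request)

    def accept(self, cycle: int, req: LoadRequest) -> None:
        self.pending.append((cycle + self.latency, req))

    def ready(self, cycle: int) -> Optional[LoadRequest]:
        """Return (and remove) one request whose data is ready, if any."""
        for entry in self.pending:
            if entry[0] <= cycle:
                self.pending.remove(entry)
                return entry[1]
        return None


def simulate(requests: List[Tuple[int, LoadRequest]],
             memory: Dict[int, int], latency: int) -> List[tuple]:
    """Log of (cycle, kind, tag, requester, value) bus transactions."""
    mc = MemoryController(memory, latency)
    queue = sorted(requests, key=lambda r: r[0])
    log: List[tuple] = []
    cycle = 0
    while queue or mc.pending:
        done = mc.ready(cycle)
        if done is not None:
            # Second phase: the memory controller re-arbitrates and responds.
            log.append((cycle, "load-response", done.tag, done.requester,
                        memory[done.address]))
        elif queue and queue[0][0] <= cycle:
            # First phase: the processor drives the request, then releases the bus.
            _, req = queue.pop(0)
            mc.accept(cycle, req)
            log.append((cycle, "load", req.tag, req.requester, req.address))
        cycle += 1
    return log


if __name__ == "__main__":
    mem = {0x10: 7, 0x20: 9}
    reqs = [(0, LoadRequest(0, "P1", 0x10)), (1, LoadRequest(1, "P2", 0x20))]
    for entry in simulate(reqs, mem, latency=3):
        print(entry)
```

In this run P2's request goes onto the bus while P1's load is still outstanding, and the tag on each load response tells the system which request it answers.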