1 00:00:03,240 --> 00:00:08,635 And I just wanted to make a quick note about interconnection networks. If you guys 2 00:00:08,635 --> 00:00:13,444 get interested in interconnection networks as we go along, I highly recommend this 3 00:00:13,444 --> 00:00:18,370 book. I have not assigned anything from this book for this class. But this is Bill 4 00:00:18,370 --> 00:00:23,003 Dally and Brian Towles' interconnection networks book. It's quite good. It's kind of the 5 00:00:23,003 --> 00:00:29,550 definitive guide on the subject matter. Okay. So, interconnection networks. We 6 00:00:29,550 --> 00:00:37,200 talked about buses and we talked about memory protocols over buses. That's only 7 00:00:37,640 --> 00:00:45,875 one way to share information. Now, people may argue that's an intuitive way to share 8 00:00:45,875 --> 00:00:51,481 information because we had the ability to do loads and stores from our processor. But 9 00:00:51,481 --> 00:00:56,824 there are other ways to share information. So, in today's lecture, we are going to 10 00:00:57,276 --> 00:01:03,222 talk about two main topics. One, how do you share information 11 00:01:02,319 --> 00:01:08,039 in a different way, which commingles the movement of data with a 12 00:01:08,039 --> 00:01:16,477 synchronization primitive. We are going to call that messaging, which is in contrast 13 00:01:16,477 --> 00:01:22,856 to shared memory, or communicating via memory addresses. We're also going to talk about 14 00:01:22,856 --> 00:01:27,940 different ways to connect together processors, which can either have better 15 00:01:27,940 --> 00:01:34,516 performance or better scalability, i.e., have more nodes in the system. And, okay, 16 00:01:34,516 --> 00:01:43,378 so let's compare and contrast buses to other forms of network and see why we 17 00:01:43,378 --> 00:01:48,339 might want to change. So, this is going back to what we had just talked 18 00:01:48,339 --> 00:01:53,697 about.
Let's say you have two cores on a bus. And let's forget about how you want 19 00:01:53,300 --> 00:01:58,857 to communicate. It could either be via shared memory or it could be via messaging 20 00:01:58,857 --> 00:02:04,149 or it could be via some other protocol, Ethernet, whatever you want to put 21 00:02:04,149 --> 00:02:09,051 here, which is basically a form of messaging. We have one core and it wants 22 00:02:09,051 --> 00:02:14,212 to communicate with another core. Note, I don't draw any caches here because there may 23 00:02:14,212 --> 00:02:19,067 not be caches, there may be caches. It's kind of immaterial here. If 24 00:02:19,067 --> 00:02:23,676 one person wants to talk to another person, they can just yell at the other 25 00:02:23,676 --> 00:02:28,469 person. It's two people, so it's pretty easy to do. Now, there are some 26 00:02:28,469 --> 00:02:32,709 challenges: we can't both talk at the same time, or we might not be able to 27 00:02:32,709 --> 00:02:37,502 understand each other. So, there's some arbitration that needs to happen. But, in 28 00:02:37,502 --> 00:02:41,926 general, that arbitration is pretty simple. There are only two cores or entities on 29 00:02:41,926 --> 00:02:48,479 this bus. Okay. Now we go to more cores. So, we have four people in a room 30 00:02:48,479 --> 00:02:54,262 trying to shout to each other. Well, or four people on a bus trying to shout to 31 00:02:54,262 --> 00:02:59,140 each other. And as we just talked about, the bandwidth can be a challenge 32 00:02:59,140 --> 00:03:04,603 here. The arbitration for the bus can be a challenge. And because we're talking about 33 00:03:04,603 --> 00:03:10,418 interconnection networks, the wire delay and capacitance of the network can be 34 00:03:10,418 --> 00:03:16,740 worse, or it can be a challenge here. So, if we've got one core, it needs to drive the 35 00:03:16,740 --> 00:03:23,140 shared multidrop bus.
There's a lot of capacitance on this bus, much more so than 36 00:03:23,140 --> 00:03:29,067 in the two-core case, because all of a sudden we've doubled the length of the bus, so 37 00:03:29,067 --> 00:03:34,836 the wires are longer, and we've also put more loads on the bus. So, there's 38 00:03:34,836 --> 00:03:42,764 actually more capacitance on this bus. Okay. Now, we start to think about trying 39 00:03:42,764 --> 00:03:48,599 to build a bus that has a lot more cores. In this case, twelve. And for this 40 00:03:48,599 --> 00:03:53,802 core, if you go to shout to that core, there's no pipelining on this bus or 41 00:03:53,802 --> 00:03:58,807 anything. You go to shout and it has to propagate all the way down here and, you 42 00:03:58,807 --> 00:04:03,483 know, we're talking about high rates of communication. You actually have 43 00:04:03,483 --> 00:04:09,063 to wait for the time of flight of light from here to get down to there. And 44 00:04:09,063 --> 00:04:13,356 if we're using something, let's say, like a snoopy protocol or a 45 00:04:13,356 --> 00:04:20,174 broadcast protocol, because that's all we have here, we have to wait for a node 46 00:04:20,174 --> 00:04:23,863 here to communicate with every other node. So, we have to wait for the worst-case 47 00:04:23,863 --> 00:04:29,668 time for this node to communicate to that node, every clock cycle. Hm, okay. And as 48 00:04:29,668 --> 00:04:35,022 I said, there is capacitance, so it's 49 00:04:35,022 --> 00:04:40,823 not just a transmission line problem here. We also have to worry about the 50 00:04:40,823 --> 00:04:45,969 capacitance in trying to drive all of these different receivers. And it's a 51 00:04:45,969 --> 00:04:51,870 multidirectional bus, so we effectively have to have tri-states and the ability to 52 00:04:51,870 --> 00:04:57,073 drive or just receive.
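To put rough numbers on the time-of-flight point above, here is a minimal sketch. The signal speed and the bus lengths are illustrative assumptions, not figures from the lecture, and `max_bus_clock_hz` is a hypothetical helper. It only captures the bound that every cycle of an unpipelined broadcast bus has to cover the worst-case end-to-end propagation time.

```python
# Illustrative numbers only; none of these values come from the lecture.
SIGNAL_SPEED_M_PER_S = 1.5e8  # assumed: roughly half the speed of light in a vacuum


def max_bus_clock_hz(bus_length_m, speed_m_per_s=SIGNAL_SPEED_M_PER_S):
    """Upper bound on the clock of an unpipelined broadcast bus.

    Every cycle must cover the worst-case end-to-end time of flight,
    so the cycle time cannot be shorter than length / speed.
    Driver and receiver capacitance (ignored here) only makes it slower.
    """
    worst_case_flight_s = bus_length_m / speed_m_per_s
    return 1.0 / worst_case_flight_s


# Hypothetical bus lengths, from a short bus to a much longer one.
for length_m in (0.05, 0.10, 0.30):
    ghz = max_bus_clock_hz(length_m) / 1e9
    print(f"{length_m:.2f} m bus: at most {ghz:.2f} GHz")
```

Under these assumptions, doubling the bus length halves the best-case clock, before the extra capacitance of the added loads is even counted.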
Well, all of a sudden, we have twelve people, and 53 00:04:57,073 --> 00:05:01,875 actually, we have twelve people in this room. So, let's all try to pick a number 54 00:05:01,875 --> 00:05:06,616 between one and ten and shout it real fast on the count of three. One, two, three, 55 00:05:06,616 --> 00:05:11,049 five. Okay. I shouted five; I don't know what everyone else 56 00:05:11,049 --> 00:05:15,174 said. So, could everyone hear everyone else's? Does 57 00:05:15,174 --> 00:05:20,099 everyone know exactly what all other ten people said at the same time, or twelve 58 00:05:20,099 --> 00:05:24,902 people said at the same time? You heard your nearest neighbor. Okay, but 59 00:05:24,902 --> 00:05:33,183 do you know what Yankey said? Yeah. Okay. So, this is the challenge. 60 00:05:33,183 --> 00:05:37,840 And if we need to guarantee that only one person can yell on the bus at a time, we 61 00:05:37,840 --> 00:05:42,553 need some arbitration. But the arbitration logic is slower now because we have lots 62 00:05:42,553 --> 00:05:47,491 of people communicating, so we have to run a wire from this node down to this node 63 00:05:47,491 --> 00:05:52,202 and then we have to come back to the arbitration logic, say, over here, 64 00:05:52,202 --> 00:05:57,502 that needs to make some decision. And the decision is slower because there are more layers 65 00:05:57,502 --> 00:06:02,475 of logic, more combinational logic, we will say, to make the arbitration decision. 66 00:06:02,475 --> 00:06:06,598 Hm, okay. Now, if we go to a thousand processors or a thousand cores on a 67 00:06:06,598 --> 00:06:11,170 bus, you know, we couldn't even have twelve people in the room shout at the 68 00:06:11,170 --> 00:06:14,948 same time.
You can't have a thousand people in the room shout at the same time, and 69 00:06:14,948 --> 00:06:18,676 the physical distance of the wiring between these thousand different nodes is 70 00:06:18,676 --> 00:06:22,503 going to decrease the speed of the bus significantly. So, it's something to 71 00:06:22,503 --> 00:06:30,671 think about. So, this motivates us to take the same twelve cores and think 72 00:06:30,671 --> 00:06:37,312 about some other ways to connect them. Now, what I'm going to show here is 73 00:06:37,312 --> 00:06:46,219 what's known as a switched interconnect, or sometimes known as a point-to-point link 74 00:06:46,219 --> 00:06:50,878 solution. Now, point-to-point does not mean that this core can communicate 75 00:06:50,878 --> 00:06:55,792 directly with every other core. That has a different name; we'll talk 76 00:06:55,792 --> 00:07:01,427 about that later today. Instead, point-to-point just means each link only 77 00:07:01,427 --> 00:07:04,935 has one sender and one receiver. And then 78 00:07:04,935 --> 00:07:12,649 you use switches along the way to make decisions and to route. So, if we look at 79 00:07:12,649 --> 00:07:17,072 this, we can actually have multiple nearest-neighbor communications happening. 80 00:07:17,072 --> 00:07:21,379 So, all of a sudden, by adding this switching, we can both have connectivity 81 00:07:21,379 --> 00:07:25,977 between all the different nodes, but we can also have sort of subconversations 82 00:07:25,977 --> 00:07:31,962 happening. But this still allows for this processor here to go communicate with the 83 00:07:31,962 --> 00:07:37,380 one that's at the farthest extent. And we need to decide how to do that: whether it 84 00:07:37,380 --> 00:07:42,865 communicates sort of this way or this way or that way or some other squiggly line. 85 00:07:43,400 --> 00:07:48,751 We can also take the same point-to-point switched interconnect network.
And, like a 86 00:07:48,751 --> 00:07:54,035 bus, where we can increase the width of the bus, which does not help us with the 87 00:07:54,035 --> 00:07:59,187 occupancy on the bus, we can add more networks, or we can effectively add 88 00:07:59,187 --> 00:08:03,248 multiple concurrent switched interconnection networks, or we can 89 00:08:03,248 --> 00:08:09,307 increase the bandwidth on these links. So, it's similar sorts of ideas there, and 90 00:08:09,307 --> 00:08:14,915 the similar sorts of bandwidth tricks you can do to increase bandwidth on buses, you can 91 00:08:14,915 --> 00:08:19,685 play on switched interconnection networks. Okay. So, this is just a very 92 00:08:19,685 --> 00:08:24,520 broad overview. And now, we're getting into some more specific ideas.
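As a rough illustration of the switched, point-to-point idea described above, the sketch below lays the twelve cores out as a 3x4 mesh and routes along the row first, then along the column. Both the mesh layout and this dimension-ordered routing choice are assumptions made for the example; the lecture only says that some route has to be chosen. The helper names `xy_route` and `can_run_concurrently` are hypothetical.

```python
ROWS, COLS = 3, 4  # assumed layout: twelve cores arranged as a 3x4 mesh


def xy_route(src, dst):
    """Return the directed links (node, node) used to get from src to dst.

    Nodes are (row, col) pairs. The route first moves along the row
    (changing the column), then along the column (changing the row).
    This is one common choice, not something the lecture prescribes.
    """
    for r, c in (src, dst):
        if not (0 <= r < ROWS and 0 <= c < COLS):
            raise ValueError(f"node {(r, c)} is outside the mesh")
    (r, c), (dr, dc) = src, dst
    links = []
    while c != dc:
        nxt = (r, c + (1 if dc > c else -1))
        links.append(((r, c), nxt))
        r, c = nxt
    while r != dr:
        nxt = (r + (1 if dr > r else -1), c)
        links.append(((r, c), nxt))
        r, c = nxt
    return links


def can_run_concurrently(route_a, route_b):
    """Each link has one sender and one receiver, so two transfers
    conflict only if they need the same directed link."""
    return not set(route_a) & set(route_b)


near_a = xy_route((0, 0), (0, 1))  # a nearest-neighbor conversation
near_b = xy_route((2, 2), (2, 3))  # another one elsewhere in the mesh
far = xy_route((0, 0), (2, 3))     # corner to the farthest corner

print(len(far))                              # 5 hops
print(can_run_concurrently(near_a, near_b))  # True
print(can_run_concurrently(near_a, far))     # False: both need (0,0)->(0,1)
```

On a single shared bus all three transfers would have to take turns; here only the two that need the same link do.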