Okay, so now we start talking about the performance of our networks, and there are two main thoughts I want to get across here; there are two really good ways to measure our networks. Before, when we talked about performance, we were talking about parameters of the topology; now we're going to look at overall network performance. The first thing is bandwidth. Bandwidth is the rate at which data can be transmitted over a given network link: an amount of data divided by an amount of time. Okay, that sounds pretty reasonable. Latency is how long it takes to communicate, to send a complete message between a sender and a receiver, in seconds. So the unit on latency is seconds, and the unit on bandwidth is something like bits per second or bytes per second, an amount of data per second. These two things are linked. If we take a look at something like bandwidth, it can actually affect our latency. The reason is that if you increase the bandwidth, you have to send fewer pieces of data for a long message, because you can send it in wider chunks, or faster chunks, or something like that. So it can actually help with latency. It can also help with latency because it can reduce the congestion on your network. Now, we haven't talked about congestion yet; we'll talk about it in a few more slides. But by having more bandwidth, you can effectively reduce the load on your network, and that decreases the probability that two different messages are contending for the same link in the network. Latency can also affect our bandwidth, which is interesting, or rather, it can affect our delivered bandwidth. Changing the latency won't make our links wider or our network's clock speed faster, but it can make the delivered bandwidth higher or lower. Here's how that can happen. Let's say you have something like a round trip: you're trying to communicate from point A to point B and back to point A. This is pretty common: you send a message from one node to another node, that node does some work on it, and it sends back a reply. If you can't cover that latency, then as the latency gets longer, the sender just sits there and stalls more, and that effectively decreases the bandwidth, the amount of data that can be sent.
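To make that concrete, here's a minimal sketch of a blocking request-reply exchange, where only one message is ever in flight. The link speed and message size are made-up numbers for illustration, not anything from the lecture:

```python
# Effective bandwidth of a blocking request/reply pattern: the sender
# transmits one message, then stalls until the reply comes back
# before sending the next one.

LINK_BANDWIDTH = 10e9      # bits/second (assumed 10 Gb/s link)
MESSAGE_SIZE   = 8 * 1500  # bits (assumed 1500-byte message)

def effective_bandwidth(round_trip_latency_s: float) -> float:
    """Delivered bits/second when only one message is in flight."""
    serialization = MESSAGE_SIZE / LINK_BANDWIDTH  # time on the wire
    time_per_message = serialization + round_trip_latency_s
    return MESSAGE_SIZE / time_per_message

# As latency grows, delivered bandwidth collapses even though the raw
# link bandwidth never changed.
for rtt in (1e-6, 100e-6, 10e-3):
    print(f"RTT {rtt*1e6:>8.0f} us -> {effective_bandwidth(rtt)/1e6:,.1f} Mb/s")
```

The raw link never got slower; the sender just spends more of its time stalled waiting for replies.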
Now, if you're good at hiding this latency by doing other work, that may not happen; you may not be limited by latency. But another good example is when you're subject to end-to-end flow control. A good example of this is TCP/IP networks, like our Ethernet networks. There's a round-trip flow control between the two endpoints, which rate-limits the bandwidth, and it's tied to the latency, because you need to have more traffic in flight to cover the round-trip latency. This is characterized by what's called the bandwidth-delay product, where you multiply your bandwidth by the delay, the latency, of your network. If you increase the latency, the delivered bandwidth effectively goes down unless you allow more traffic in flight before you wait to hear a flow-control response. You'll see this if you take, say, two points on the internet and put them farther apart while keeping the same amount of in-flight data, what's called the window: the bandwidth goes down as you increase the latency. But if you were to increase the window, the bandwidth would actually stay high, because of the bandwidth-delay product. The reason is that otherwise you'd be sitting around waiting for ACKs to come back from the receive side.
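Here's a small sketch of that window-limited throughput and the bandwidth-delay product; the link rate and window size are illustrative assumptions (64 KiB is just the classic TCP window limit):

```python
def delivered_bandwidth(window_bits: float, rtt_s: float,
                        link_bandwidth: float) -> float:
    """Throughput of a window-based protocol: at most `window_bits`
    can be unacknowledged in flight per round trip, so the delivered
    rate is capped at window/RTT (and at the raw link rate)."""
    return min(link_bandwidth, window_bits / rtt_s)

LINK   = 1e9              # bits/s (assumed 1 Gb/s link)
WINDOW = 8 * 64 * 1024    # bits (assumed 64 KiB window)

for rtt in (1e-3, 10e-3, 100e-3):
    bdp = LINK * rtt      # bits that must be in flight to fill the pipe
    bw  = delivered_bandwidth(WINDOW, rtt, LINK)
    print(f"RTT {rtt*1e3:>5.0f} ms: BDP = {bdp/8/1024:,.0f} KiB, "
          f"delivered = {bw/1e6:,.1f} Mb/s")
```

Once the window is at least the bandwidth-delay product, the pipe stays full and added latency stops costing you throughput.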
Okay, so let's take a look at an example here to understand these different parameters. We have a four-node omega network with two-input, two-output routers. Each of these circles represents an input node, and these are the output nodes; they wrap around, so they're really the same nodes. We have little slashes here, which represent serializers and deserializers. What this means is that you're transmitting some long piece of data, and it gets sent as smaller flits, if you will. So we're sending, let's say, a 32-bit word, and it gets serialized into four 8-bit chunks across the links, because the links in this network are only eight bits wide, we'll say. And in this network our latencies are going to be non-unit. Let's say each link traversal takes two cycles, L0 and L1, and our routers take three cycles, R0, R1, and R2. To go from any point to any other point in this network, you have to go through two routers and one link. So we can draw a pipeline diagram for this. For a given packet, we can see it split into four flits: a head flit, two body flits, and a tail flit. We start at the source; it takes three cycles to make a routing decision through the first router, two cycles across the link, three cycles through the second router, and then we arrive at the destination. And if we look at this in time, it's pipelined; we can have multiple flits going down the network at the same time, each subsequent flit one cycle delayed. The reason we want to draw this is to work out our latency for sending this one packet, because it's a little hard to reason about; we effectively have a pipeline here, and we're overlapping different things. And we'll see that one of the terms you'd expect doesn't show up. First, we have four cycles at the beginning, which is just our serialization latency: the length of the packet divided by the bandwidth of the link. If you were to increase the bandwidth here, the serialization latency would go down. Then we have time in the router, our router pipeline latency: three cycles here, and another three cycles in the second router, and if we have more hops this goes up. And then two cycles here for the channel latency, which we'll call Tc. So our latency is the sum of all of these different latencies, but what's interesting is that there is no deserialization latency here. That's the term that's missing, and it's because we've overlapped it: since everything is pipelined, it's already counted in the serialization latency. Questions about that so far?
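As a sanity check, here's a sketch that just adds up the pipeline stages from this example, two 3-cycle routers, one 2-cycle link, and four one-cycle flits. Note there is deliberately no deserialization term, since it overlaps with serialization:

```python
def packet_latency_cycles(router_hops: int, router_cycles: int,
                          link_hops: int, link_cycles: int,
                          packet_bits: int, link_width_bits: int) -> int:
    """Unloaded latency of one packet through a pipelined network:
    head-flit latency through routers and links, plus serialization
    of the remaining flits. Deserialization overlaps the pipeline,
    so it contributes no extra term."""
    serialization = packet_bits // link_width_bits  # flits, 1 per cycle
    head_latency = router_hops * router_cycles + link_hops * link_cycles
    return head_latency + serialization

# The lecture's example: a 32-bit packet over 8-bit links,
# two 3-cycle routers, and one 2-cycle link.
print(packet_latency_cycles(router_hops=2, router_cycles=3,
                            link_hops=1, link_cycles=2,
                            packet_bits=32, link_width_bits=8))  # -> 12
```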
Okay, so now let's take a look at our message latency and go into a little more detail. If you look at our overall latency, which we'll denote T, it's the latency for the head flit to get to the receiver, T_head, plus the serialization latency. Now, T_head contains our channel latency Tc and our router latency Tr, each multiplied by a number of hops, but it also contains a contention term, which we haven't shown. In the number we just computed there was no contention; that was an unloaded network, with no multiple nodes or multiple messages trying to use any one given link in the design. But it can happen: say these two nodes send at the same time and they both need to use this link. You're going to get contention, and that will increase our latency. But if we rule out contention for a little bit, we get the unloaded latency, and we can decompose it into sub-components: T0 = Hr*Tr + Hc*Tc + L/b. That is the router latency times the number of router hops we make, plus the channel latency times the number of channel links we hop across, plus the serialization latency. The reason we decompose it this way is that it lets us reason about how to make networks faster, and we can see there are a couple of different ways to do that. The first thing we can do is make shorter routes; that decreases both of these uppercase H's. The reason I have two different uppercase H's is that, as you can see in this example, we took two router hops but only one link hop. Usually they're connected, though: if you have to go farther, you need more links and more router hops. Second, you can make the routers faster: you can increase the clock frequency of the routers, or you can make them wider if they take multiple cycles. Now, if they're already about as fast as they can go, that may be hard; you might be able to raise the clock frequency somehow, but it gets difficult at some point if they already have wide channels, wide muxes, and a fast clock rate. Third, faster channels. If you go between multiple chips, you're usually limited by the signal integrity of the communication links between the chips, and this sometimes even happens on chip, so keep in mind that going to a higher clock frequency could be problematic. But if you can make a faster channel, your latency goes down. And then, finally, there's our serialization cost, and we attack that with either wider channels or shorter messages. Maybe you have a lot of overhead on each message, a really big header; if you can shrink that, your network goes faster and your latency drops, just by sending less data. That may not always be possible, but here's a rough sense of what header overhead costs, and then a real-world example.
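This sketch shows how fixed per-message header overhead eats into useful bandwidth; the payload sizes are made up, and 40 bytes is roughly a TCP plus IPv4 header pair without options:

```python
def goodput_fraction(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of the wire bandwidth carrying useful payload:
    every message pays for its header as pure overhead."""
    return payload_bytes / (payload_bytes + header_bytes)

# Small messages feel header overhead the most.
for payload in (64, 512, 1460):
    print(f"{payload:>5}-byte payload, 40-byte headers: "
          f"{goodput_fraction(payload, 40):.0%} useful")
```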
For the real-world example: if you look at something like TCP on top of IP networks, our internet-class networks, people have proposed a whole bunch of revisions that try to squeeze some bytes out, or use encoding schemes to reduce the amount of data in the headers, because the TCP header is pretty long. And you already see a good example of that in TCP itself: the header has an options field that is typically not sent, which keeps the header shorter in the common case. Okay, so now let's talk about the effects of congestion. What I've drawn here is a plot of latency versus the amount of bandwidth that is achieved, the offered bandwidth, for a given network. It's pretty common that as you increase the bandwidth you're using of any given network, the latency of the network goes up, because you start to see more congestion: the probability that any two messages contend for the same link goes up as you get closer to the maximum achievable bandwidth. Now, there are some networks people build where the graph does not look like this. For instance, if you have a star topology, you don't have any congestion, so you get something that looks much more like the ideal plot here, a straight line, because as you increase the load, everyone can still send to everyone else without contention inside the network. And I have a few lines here that show the effects that chip away at this. In a perfect world, you'd have your zero-load latency, the latency of the unloaded network, and it wouldn't change as you increase the bandwidth, if you had no congestion in the network. But that's not usually what you see in real-world networks. A couple of other things also increase the latency and decrease the delivered bandwidth of a network. Usually there's some routing delay that gets introduced, and that pushes us away from higher bandwidth and lower latency; you want to be farther down in this plot, because that's lower latency. Also, if you have flow control in the network, local flow control, that too looks like a form of congestion, and it will slow down your network in certain cases. But I just wanted to give you the idea that any real-world network usually looks something like this: as you get closer and closer to using the whole network, all the bandwidth it makes available,
the latency starts to shoot up asymptotically, through the roof.
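That asymptote has the same shape you get from a simple queueing model. As a purely illustrative sketch, not something derived in the lecture, here's the standard 1/(1 - utilization) form:

```python
def loaded_latency(zero_load_latency: float, offered_load: float) -> float:
    """Illustrative latency vs. offered load: queueing delay scales
    roughly like 1/(1 - utilization), so latency diverges as the
    offered load (fraction of max bandwidth) approaches capacity."""
    assert 0.0 <= offered_load < 1.0
    return zero_load_latency / (1.0 - offered_load)

T0 = 12  # cycles; the zero-load latency from our earlier example
for load in (0.1, 0.5, 0.9, 0.99):
    print(f"load {load:4.0%}: ~{loaded_latency(T0, load):7.1f} cycles")
```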