Okay, so now we start talking about the performance of our networks, and there are two main thoughts I want to get across here; there are two really good ways to measure our networks. Before, when we talked about performance, we were talking about parameters of the topology; now we're going to look at overall network performance. The first thing is bandwidth. Bandwidth is the rate at which data can be transmitted over a given network link: an amount of data divided by an amount of time. Okay, that sounds pretty reasonable. Latency is how long it takes to communicate, to send a complete message between a sender and a receiver, in seconds. So the unit on latency is seconds, and the unit on bandwidth is something like bits per second or bytes per second, an amount of data per second. These two things are linked. If we take a look at something like bandwidth, it can actually affect our latency. The reason is that if you increase the bandwidth, you have to send fewer pieces of data for a long message, because you can send it in wider chunks, or faster chunks, or something like that. So it can actually help with latency. It can also help with latency because it can reduce the congestion on your network. Now, we haven't talked about congestion yet; we'll talk about it in a few more slides. But by having more bandwidth, you can effectively reduce the load on your network, and that decreases the probability that two different messages are contending for the same link in the network. Latency can also affect our bandwidth, which is interesting, or rather, it can affect our delivered bandwidth. Changing the latency won't make our links wider or our network's clock speed faster, but it can make the delivered bandwidth higher or lower. Here's how that can happen. Let's say you have something like a round trip: you're trying to communicate from point A to point B and back to point A. This is pretty common: you send a message from one node to another node, that node does some work on it, and it sends back a reply. If you can't cover that latency, then as the latency gets longer, the sender just sits there and stalls more, and that effectively decreases the bandwidth, the amount of data that can be sent.
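To make that concrete, here's a minimal sketch of a blocking request-reply exchange, where only one message is ever in flight. The link speed and message size are made-up numbers for illustration, not anything from the lecture:

```python
# Effective bandwidth of a blocking request/reply pattern: the sender
# transmits one message, then stalls until the reply comes back
# before sending the next one.

LINK_BANDWIDTH = 10e9      # bits/second (assumed 10 Gb/s link)
MESSAGE_SIZE   = 8 * 1500  # bits (assumed 1500-byte message)

def effective_bandwidth(round_trip_latency_s: float) -> float:
    """Delivered bits/second when only one message is in flight."""
    serialization = MESSAGE_SIZE / LINK_BANDWIDTH  # time on the wire
    time_per_message = serialization + round_trip_latency_s
    return MESSAGE_SIZE / time_per_message

# As latency grows, delivered bandwidth collapses even though the raw
# link bandwidth never changed.
for rtt in (1e-6, 100e-6, 10e-3):
    print(f"RTT {rtt*1e6:>8.0f} us -> {effective_bandwidth(rtt)/1e6:,.1f} Mb/s")
```

The raw link never got slower; the sender just spends more of its time stalled waiting for replies.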
Now, if you're good at hiding this latency by doing other work, that may not happen; you may not be limited by latency. But another good example is when you're subject to end-to-end flow control. A good example of this is TCP/IP networks, like our Ethernet networks. There's a round-trip flow control between the two endpoints, which rate-limits the bandwidth, and it's tied to the latency, because you need to have more traffic in flight to cover the round-trip latency. This is characterized by what's called the bandwidth-delay product, where you multiply your bandwidth by the delay, the latency, of your network. If you increase the latency, the delivered bandwidth effectively goes down unless you allow more traffic in flight before you wait to hear a flow-control response. You'll see this if you take, say, two points on the internet and put them farther apart while keeping the same amount of in-flight data, what's called the window: the bandwidth goes down as you increase the latency. But if you were to increase the window, the bandwidth would actually stay high, because of the bandwidth-delay product. The reason is that otherwise you'd be sitting around waiting for ACKs to come back from the receive side.
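Here's a small sketch of that window-limited throughput and the bandwidth-delay product; the link rate and window size are illustrative assumptions (64 KiB is just the classic TCP window limit):

```python
def delivered_bandwidth(window_bits: float, rtt_s: float,
                        link_bandwidth: float) -> float:
    """Throughput of a window-based protocol: at most `window_bits`
    can be unacknowledged in flight per round trip, so the delivered
    rate is capped at window/RTT (and at the raw link rate)."""
    return min(link_bandwidth, window_bits / rtt_s)

LINK   = 1e9              # bits/s (assumed 1 Gb/s link)
WINDOW = 8 * 64 * 1024    # bits (assumed 64 KiB window)

for rtt in (1e-3, 10e-3, 100e-3):
    bdp = LINK * rtt      # bits that must be in flight to fill the pipe
    bw  = delivered_bandwidth(WINDOW, rtt, LINK)
    print(f"RTT {rtt*1e3:>5.0f} ms: BDP = {bdp/8/1024:,.0f} KiB, "
          f"delivered = {bw/1e6:,.1f} Mb/s")
```

Once the window is at least the bandwidth-delay product, the pipe stays full and added latency stops costing you throughput.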
Okay, so let's take a look at an example here to understand these different parameters. We have a four-node omega network with two-input, two-output routers. Each of these circles represents an input node, and these are the output nodes; they wrap around, so they're really the same nodes. We have little slashes here, which represent serializers and deserializers. What this means is that you're transmitting some long piece of data, and it gets sent as smaller flits, if you will. So we're sending, let's say, a 32-bit word, and it gets serialized into four 8-bit chunks across the links, because the links in this network are only eight bits wide, we'll say. And in this network our latencies are going to be non-unit. Let's say each link traversal takes two cycles, L0 and L1, and our routers take three cycles, R0, R1, and R2. To go from any point to any other point in this network, you have to go through two routers and one link. So we can draw a pipeline diagram for this. For a given packet, we can see it split into four flits: a head flit, two body flits, and a tail flit. We start at the source; it takes three cycles to make a routing decision through the first router, two cycles across the link, three cycles through the second router, and then we arrive at the destination. And if we look at this in time, it's pipelined; we can have multiple flits going down the network at the same time, each subsequent flit one cycle delayed. The reason we want to draw this is to work out our latency for sending this one packet, because it's a little hard to reason about; we effectively have a pipeline here, and we're overlapping different things. And we'll see that one of the terms you'd expect doesn't show up. First, we have four cycles at the beginning, which is just our serialization latency: the length of the packet divided by the bandwidth of the link. If you were to increase the bandwidth here, the serialization latency would go down. Then we have time in the router, our router pipeline latency: three cycles here, and another three cycles in the second router, and if we have more hops this goes up. And then two cycles here for the channel latency, which we'll call Tc. So our latency is the sum of all of these different latencies, but what's interesting is that there is no deserialization latency here. That's the term that's missing, and it's because we've overlapped it: since everything is pipelined, it's already counted in the serialization latency. Questions about that so far?
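As a sanity check, here's a sketch that just adds up the pipeline stages from this example, two 3-cycle routers, one 2-cycle link, and four one-cycle flits. Note there is deliberately no deserialization term, since it overlaps with serialization:

```python
def packet_latency_cycles(router_hops: int, router_cycles: int,
                          link_hops: int, link_cycles: int,
                          packet_bits: int, link_width_bits: int) -> int:
    """Unloaded latency of one packet through a pipelined network:
    head-flit latency through routers and links, plus serialization
    of the remaining flits. Deserialization overlaps the pipeline,
    so it contributes no extra term."""
    serialization = packet_bits // link_width_bits  # flits, 1 per cycle
    head_latency = router_hops * router_cycles + link_hops * link_cycles
    return head_latency + serialization

# The lecture's example: a 32-bit packet over 8-bit links,
# two 3-cycle routers, and one 2-cycle link.
print(packet_latency_cycles(router_hops=2, router_cycles=3,
                            link_hops=1, link_cycles=2,
                            packet_bits=32, link_width_bits=8))  # -> 12
```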
Okay, so now let's take a look at our message latency and go into a little more detail. If you look at our overall latency, which we'll denote T, it's the latency for the head flit to get to the receiver, T_head, plus the serialization latency. Now, T_head contains our channel latency Tc and our router latency Tr, each multiplied by a number of hops, but it also contains a contention term, which we haven't shown. In the number we just computed there was no contention; that was an unloaded network, with no multiple nodes or multiple messages trying to use any one given link in the design. But it can happen: say these two nodes send at the same time and they both need to use this link. You're going to get contention, and that will increase our latency. But if we rule out contention for a little bit, we get the unloaded latency, and we can decompose it into sub-components: T0 = Hr*Tr + Hc*Tc + L/b. That is the router latency times the number of router hops we make, plus the channel latency times the number of channel links we hop across, plus the serialization latency. The reason we decompose it this way is that it lets us reason about how to make networks faster, and we can see there are a couple of different ways to do that. The first thing we can do is make shorter routes; that decreases both of these uppercase H's. The reason I have two different uppercase H's is that, as you can see in this example, we took two router hops but only one link hop. Usually they're connected, though: if you have to go farther, you need more links and more router hops. Second, you can make the routers faster: you can increase the clock frequency of the routers, or you can make them wider if they take multiple cycles. Now, if they're already about as fast as they can go, that may be hard; you might be able to raise the clock frequency somehow, but it gets difficult at some point if they already have wide channels, wide muxes, and a fast clock rate. Third, faster channels. If you go between multiple chips, you're usually limited by the signal integrity of the communication links between the chips, and this sometimes even happens on chip, so keep in mind that going to a higher clock frequency could be problematic. But if you can make a faster channel, your latency goes down. And then, finally, there's our serialization cost, and we attack that with either wider channels or shorter messages. Maybe you have a lot of overhead on each message, a really big header; if you can shrink that, your network goes faster and your latency drops, just by sending less data. That may not always be possible, but here's a rough sense of what header overhead costs, and then a real-world example.
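This sketch shows how fixed per-message header overhead eats into useful bandwidth; the payload sizes are made up, and 40 bytes is roughly a TCP plus IPv4 header pair without options:

```python
def goodput_fraction(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of the wire bandwidth carrying useful payload:
    every message pays for its header as pure overhead."""
    return payload_bytes / (payload_bytes + header_bytes)

# Small messages feel header overhead the most.
for payload in (64, 512, 1460):
    print(f"{payload:>5}-byte payload, 40-byte headers: "
          f"{goodput_fraction(payload, 40):.0%} useful")
```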
For the real-world example: if you look at something like TCP on top of IP networks, our internet-class networks, people have proposed a whole bunch of revisions that try to squeeze some bytes out, or use encoding schemes to reduce the amount of data in the headers, because the TCP header is pretty long. And you already see a good example of that in TCP itself: the header has an options field that is typically not sent, which keeps the header shorter in the common case. Okay, so now let's talk about the effects of congestion. What I've drawn here is a plot of latency versus the amount of bandwidth that is achieved, the offered bandwidth, for a given network. It's pretty common that as you increase the bandwidth you're using of any given network, the latency of the network goes up, because you start to see more congestion: the probability that any two messages contend for the same link goes up as you get closer to the maximum achievable bandwidth. Now, there are some networks people build where the graph does not look like this. For instance, if you have a star topology, you don't have any congestion, so you get something that looks much more like the ideal plot here, a straight line, because as you increase the load, everyone can still send to everyone else without contention inside the network. And I have a few lines here that show the effects that chip away at this. In a perfect world, you'd have your zero-load latency, the latency of the unloaded network, and it wouldn't change as you increase the bandwidth, if you had no congestion in the network. But that's not usually what you see in real-world networks. A couple of other things also increase the latency and decrease the delivered bandwidth of a network. Usually there's some routing delay that gets introduced, and that pushes us away from higher bandwidth and lower latency; you want to be farther down in this plot, because that's lower latency. Also, if you have flow control in the network, local flow control, that too looks like a form of congestion, and it will slow down your network in certain cases. But I just wanted to give you the idea that any real-world network usually looks something like this: as you get closer and closer to using the whole network, all the bandwidth it makes available,
the latency starts to shoot up asymptotically, through the roof.
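That asymptote has the same shape you get from a simple queueing model. As a purely illustrative sketch, not something derived in the lecture, here's the standard 1/(1 - utilization) form:

```python
def loaded_latency(zero_load_latency: float, offered_load: float) -> float:
    """Illustrative latency vs. offered load: queueing delay scales
    roughly like 1/(1 - utilization), so latency diverges as the
    offered load (fraction of max bandwidth) approaches capacity."""
    assert 0.0 <= offered_load < 1.0
    return zero_load_latency / (1.0 - offered_load)

T0 = 12  # cycles; the zero-load latency from our earlier example
for load in (0.1, 0.5, 0.9, 0.99):
    print(f"load {load:4.0%}: ~{loaded_latency(T0, load):7.1f} cycles")
```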