1
00:00:03,200 --> 00:00:06,643
Okay.
This brings us to our next optimization,

2
00:00:06,643 --> 00:00:11,665
which is the amount of [inaudible]
optimizations and this is one that

3
00:00:11,665 --> 00:00:15,395
actually is no longer talked about in your
textbook.

4
00:00:15,395 --> 00:00:20,346
The old versions of your textbook did talk
about this optimization.

5
00:00:20,561 --> 00:00:25,870
I, I think it's still an important,
interesting optimization to think about

6
00:00:25,870 --> 00:00:29,026
and this is something called a victim
cache.

7
00:00:29,026 --> 00:00:33,402
The victim cache, let's take a look at the
picture first. Here,

8
00:00:33,402 --> 00:00:36,990
Cpu, L1 cache, L2 cache,
It's a multilevel cache.

9
00:00:36,990 --> 00:00:45,512
But now.,
If you're missing your L1, you put another

10
00:00:45,512 --> 00:00:51,860
little spot check.
We put something we call a victim cache

11
00:00:51,860 --> 00:00:57,944
and a victim cache is a fully associative
cache but it's usually very small.

12
00:00:57,944 --> 00:01:02,133
So, let's say, one, two, three, four
entries, something like that.

13
00:01:02,133 --> 00:01:07,368
Seems very small cache structure here, but
it's fully associative.

14
00:01:07,368 --> 00:01:13,302
But what's nice about this is sometimes in
programs you have a problem where, I don't

15
00:01:13,302 --> 00:01:18,607
know, let's say, you have a two-way set
associative L1, which you have three

16
00:01:18,607 --> 00:01:23,633
things that want to be in that one
location that has one index to cache.

17
00:01:23,633 --> 00:01:28,660
This keep fighting and knocking each other
out over and over and over again.

18
00:01:29,400 --> 00:01:35,222
Just actually one extra fully associative
entry would solve this.

19
00:01:35,222 --> 00:01:41,402
And then you have the edit here in the way
zero, way, way one, and now it's victim

20
00:01:41,402 --> 00:01:46,949
cache.
So this is actually the, the first

21
00:01:46,949 --> 00:01:51,501
processor that showed up in was an HP
PA-risc processor.

22
00:01:52,742 --> 00:01:57,541
Probably HP 7,000 per [inaudible] 7,000
series processor.

23
00:01:57,541 --> 00:02:02,920
But, you can actually have this, this
little extra cache here.

24
00:02:03,168 --> 00:02:09,577
A couple extra little things you need to,
to think about though, is, whether you

25
00:02:09,577 --> 00:02:14,865
check this cache in series or in parallel
with the L1.

26
00:02:14,865 --> 00:02:19,680
If you're checking in parallel it's going
to hurt your hit time.

27
00:02:19,680 --> 00:02:24,733
We've checked with the series, checked
this first, then checked this, and then

28
00:02:24,733 --> 00:02:27,925
checked the L2.
You know, checking these together so we'd

29
00:02:28,324 --> 00:02:33,377
have a worst hit time, like we need your
L1 to sort of pass it along the way it

30
00:02:33,377 --> 00:02:37,700
was, so you don't have to put multiplexer
and, after your L1 checked.

31
00:02:37,700 --> 00:02:44,081
It's a trade off. People, well,
Well, checking both in parallel and in

32
00:02:44,081 --> 00:02:49,137
serial.
Let's look at the mechanics of how this

33
00:02:49,137 --> 00:02:53,285
works.
So on a this inner L1 we're we generate

34
00:02:53,285 --> 00:02:57,731
the victim,
Gets your transition the victim to here

35
00:02:57,731 --> 00:03:03,020
and then we bring in the new data from
here into the L1.

36
00:03:03,600 --> 00:03:08,679
Unless, the data is in the victim cache
already and then all we do is basically

37
00:03:08,679 --> 00:03:12,215
swap the two,
Take the lines in the victim and lines in

38
00:03:12,215 --> 00:03:17,229
the data, and swa, but you usually have
sort of separate data paths in that.

39
00:03:17,229 --> 00:03:23,665
You can have going on simultaneously.
So you could, after one big, we are one at

40
00:03:23,665 --> 00:03:30,323
the other, and vice versa.
If you missing the victim cache, missing

41
00:03:30,323 --> 00:03:37,742
the L1, interesting question comes up here
with your data that's in the victim cache.

42
00:03:37,742 --> 00:03:49,325
What happens to that?
We put a victim in the victim cache.

43
00:03:49,325 --> 00:03:53,611
Do we need to do something with that
victim first?question

44
00:03:54,382 --> 00:03:56,018
Mark here.
Hm,,

45
00:03:56,216 --> 00:03:58,596
Unclear.
It really depends on design.

46
00:03:58,596 --> 00:04:04,281
If the design is such that when we,, when
we generate victim from the L1, we both

47
00:04:04,281 --> 00:04:08,050
put it in the victim cache and we flush it
down to our L2.

48
00:04:08,050 --> 00:04:12,170
Then, when we have to, let's say,
victimize from the victim cache, the

49
00:04:12,170 --> 00:04:17,027
victim cache would say only its two
entries, and we need to put a new victim

50
00:04:17,027 --> 00:04:20,471
in the victim cache. We can just throw on
to the ground.

51
00:04:20,471 --> 00:04:27,420
We don't actually need to deal with that
victim because the data is [inaudible] If

52
00:04:27,420 --> 00:04:32,524
not, we'd actually have to take the upper
victim cache that's possibly dirty and go

53
00:04:32,524 --> 00:04:38,141
over it out into the L2.
But in reality what this is trying to do

54
00:04:38,141 --> 00:04:42,587
is it's trying to reduce our
conflictnesses, by gaining us a pseudo

55
00:04:42,587 --> 00:04:46,304
higher associate cache with just a very
small second.

56
00:04:46,304 --> 00:04:51,481
It's interesting to look sort of a
[inaudible] and people that built on this,

57
00:04:51,481 --> 00:04:55,795
This victim cache, a very small victim
cache, can go a long way.

58
00:04:55,994 --> 00:05:01,558
Because there's a lots of times in the
program, there may be that one value, that

59
00:05:01,558 --> 00:05:08,486
one address which is what highly intended
in your cache, or this L1 line of cache.

60
00:05:08,486 --> 00:05:13,996
But everything else in the cache is a sort
of [inaudible] will say.

61
00:05:13,996 --> 00:05:18,170
And this can really help you with the
[inaudible].

62
00:05:18,170 --> 00:05:21,598
Okay, so going, going back to our victim
cache here,

63
00:05:21,598 --> 00:05:26,191
If we look at the sorta our victim cache
just by itself shouldn't be L1.

64
00:05:26,671 --> 00:05:32,431
This kind of looks like a multiple cache.
The answer is going to be the same as what

65
00:05:32,431 --> 00:05:36,887
we had for a multiple cache.
If you look at it just through the L1,

66
00:05:37,162 --> 00:05:44,878
it's going to make the Miss Penalty go
down as we [inaudible] If you look at the

67
00:05:44,878 --> 00:05:50,970
L1 with the victim cache, both the Miss
Penalty will be more than we had to go out

68
00:05:50,970 --> 00:05:52,680
to the next level cache.