Okay, so now we're going to spend the rest of the lecture talking about different coherence protocols on a bus, and their relative merits. I want to contrast this with what we're going to talk about in two lectures, where we'll look at coherence protocols across switched interconnects, places where you don't have a shared broadcast medium.

So as a warm-up here, we're going to start off by looking at what can happen when I/O transactions happen at the same time as memory transactions in a uniprocessor. Let's take a look at where you can have consistency problems in a uniprocessor system, as a warm-up and a motivator.

So here we have a processor, and that's its cache, and this is memory. And then somewhere on the memory bus here we hang a DMA agent, or direct memory access agent. Sometimes it's called a bus master, even; that term comes from having multiple agents which can effectively drive transactions onto the main memory bus. So if you go look at, for instance, PCI or PCI Express, which are the extension buses for your system, they'll use the term bus mastering. What that really means is that there's a DMA engine out at the I/O device.

Okay, so what I'm trying to get across here is that in a uniprocessor system you can actually overlap computation with moving data from the disk to main memory, without having to use the processor. Originally, machines required programmed I/O in order to go access the disk: the processor would read an address on the device, pull the data in, and then store it out to memory, one word at a time. As you can tell, that requires the processor to copy all of this memory itself, and it's kind of slow. So people decided, let's put direct memory access engines out at the I/O devices.

This actually goes back to early, early computers. Mainframes had very sophisticated DMA; the IBM System/360 had channel processors, effectively programmable DMA engines that could run their own I/O programs. But the simplest case is an engine that has registers which say where on disk, where in memory, and how long, plus a go button. And you could also run the transfer the other way.

Okay, so let's look at this from a coherence perspective, and let's say this cache is write-back. Let's look at a memory-to-disk transaction. We program the DMA engine, saying copy this location in memory to this location on the disk, and we tell it to go. Now, while it's doing that, the processor writes to an address inside that page. What happens? The cache is not write-through, and there's no coherence protocol really going on in this case. Well, maybe some of those dirty cache lines get written back out to main memory before the DMA engine reads them, and some don't. So the disk gets some of the new values and some of the old values. The point I'm trying to make here is that you don't know what's going to happen, and that's a little scary. You want determinism in your system.

Now, you can say, well, maybe the processor shouldn't go and write to those memory addresses while the transfer is in flight. That is a valid solution, and it's actually pretty common on modern-day systems: the OS will know when a DMA transfer is in flight, and it will just make sure not to go access those memory addresses.
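To make that race concrete, here's a minimal C sketch. The register layout and the MMIO base address are completely made up for illustration; real DMA engines differ, but the simplest design described above is exactly this: a source address, a destination, a length, and a go bit.

```c
#include <stdint.h>

/* Hypothetical memory-mapped DMA engine registers: source memory
 * address, destination disk sector, length, and a "go" button. */
typedef struct {
    volatile uint64_t mem_addr;    /* source: physical memory address */
    volatile uint64_t disk_sector; /* destination: disk sector number */
    volatile uint32_t len;         /* transfer length in bytes        */
    volatile uint32_t go;          /* write 1 to start the transfer   */
} dma_regs_t;

/* Made-up MMIO base address for this imaginary device. */
#define DMA ((dma_regs_t *)(uintptr_t)0xFEE00000UL)

static char page[4096];

void racy_write_to_disk(void)
{
    /* Program the engine and hit the go button. */
    DMA->mem_addr    = (uint64_t)(uintptr_t)page;
    DMA->disk_sector = 42;
    DMA->len         = sizeof(page);
    DMA->go          = 1;

    /* BUG: the transfer is still in flight.  With a write-back cache
     * and no coherence protocol, this store may sit as a dirty line
     * in the cache while the DMA engine reads main memory directly,
     * so the disk gets an unpredictable mix of old and new values.  */
    page[100] = 'X';
}
```

The common software fix is exactly what the lecture describes: the OS treats the page as off-limits until the device signals completion.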
Well, what I was trying to introduce here is that even a uniprocessor system can have coherence problems, depending on where the different copies of an address live. And you can also have the transfer going the other way, from the disk to main memory. This is actually probably the more interesting case: you have some data in the CPU's cache, and the DMA engine is writing from the disk into those same physical memory locations. But the CPU's cache doesn't pick up the new values. So when the processor wants to go and read that data, like reading a file off a disk or something, it's going to get the wrong, stale value.

So this moves us to our first idea in coherence protocols, one that has a funny name: snoopy caches. No, this is not named for the dog in the Peanuts cartoons. Instead, Jim Goodman, who's a professor, and I believe one of his students, came up with the idea that you have the cache snoop on the bus to see what's going on, and effectively update the cache with the data that's flying by on the bus.

So if we look at this from a little bit more of a hardware perspective: we have our cache and we have the tags, and we add a snoop port into the cache, which is effectively sitting there watching the bus. And if an address that is in the cache slides by on the bus, the cache needs to do something about it. It probably needs to invalidate the address if it's a write occurring across the bus. You can also have it go the other way: if you have a DMA engine which is reading from memory, and the data is dirty in the cache and not in memory, because it's a write-back cache, the cache may need to provide the data to the I/O device that's trying to read from memory, effectively overriding what's coming from main memory. Now, there's also arbitration actually happening there on the bus, to determine who has the actual, up-to-date data. We'll talk a little bit more in today's lecture about what the right thing to do is in these interesting corner cases. There's a little sketch of this snoop logic at the end of this section.

Before we move on here, notice this is getting a little bit hard to build. Maybe back in 1983, this wasn't so bad, but nowadays we'd have to dual-port our tags. 'Cause in the snoopy protocol, we're going to need all possible memory transactions that are going on, by any processor and/or any DMA engine, to be checked against every other caching entity in the system. That's a fair amount of bandwidth coming into the tags. And you could add a second port, and that's okay if the cache is sort of farther out, but it might slow down your cache if it's the level-one cache. So typically the way people build this is they'll snoop on the level-two cache and have a level-one cache which is not necessarily snooped.

Now, this is where we get back to inclusive versus exclusive caches. If your level-two cache is inclusive of all the data in level one, this is actually not so bad to do, because anything in the level-one cache is guaranteed to have its tags in the level-two cache; if a snoop misses in the level-two tags, you don't have to go all the way to the level-one cache at all. If you use exclusive caches, well, life gets a lot harder, because basically you need to check with the level-one cache as well. So anyway, all I was trying to get across here is that this significantly increases the price of your cache design, your tag design, as you add more ports to it.
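Here's the promised minimal sketch of that snoop logic, written as C over a toy direct-mapped, write-back cache. The types and the bus interface are invented for illustration; in real hardware this is logic on the snoop port, not software.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy direct-mapped, write-back cache with 64-byte lines. */
typedef enum { BUS_READ, BUS_WRITE } bus_op_t;
typedef struct {
    bool     valid;
    bool     dirty;
    uint64_t line_addr;   /* full line address, i.e. addr >> 6 */
    uint8_t  data[64];
} line_t;

#define NLINES 512
static line_t cache[NLINES];

/* Conceptually invoked once per transaction observed on the bus. */
void snoop(bus_op_t op, uint64_t addr, uint8_t *bus_data)
{
    uint64_t line = addr >> 6;
    line_t  *l    = &cache[line % NLINES];

    if (!l->valid || l->line_addr != line)
        return;                /* address not cached: nothing to do */

    if (op == BUS_WRITE) {
        /* Another agent (a DMA engine, another processor) is writing
         * this line, so our copy is now stale: invalidate it.       */
        l->valid = false;
    } else if (op == BUS_READ && l->dirty) {
        /* We hold the only up-to-date copy in a write-back cache:
         * supply the data, overriding main memory's stale response. */
        memcpy(bus_data, l->data, sizeof l->data);
    }
}
```

Notice that every bus transaction costs a tag lookup here, which is exactly why the tag-port bandwidth becomes the problem discussed above.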
And it's not actually just an area question. I mean, adding ports makes the tag array larger, but it also puts more pressure on your clock cycle, especially if it's your level-one tags, since those are on a critical path in your processor: every load and store has to go look them up. One way around all this is to keep a single-ported tag structure and somehow delay the cache snoop transaction while you arbitrate and wait for a turn to go access the tags. Or you can multiplex, let's say, the one port of the tags every other cycle: one cycle is for the main processor, one cycle is for the snoop transaction. There's a little sketch of that multiplexing idea below.

So just to have a little more of a block-diagram view of this: we have a bus with multiple processors, and our snoopy cache, which actually has to see all of the DMA traffic and all of the other processors' traffic across this bus. So that's a lot of bandwidth, because essentially you might have to broadcast all of your actions from one processor to all the other processors. Later we'll look at techniques to reduce the requirements of this broadcast down to a subset of the traffic.

Okay, so questions so far, about snooping and adding snoop ports, before going into the protocols?
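And here is that promised sketch of the time-multiplexed single tag port. The even/odd split is just one arbitrary policy, chosen to keep the example tiny; a real design might arbitrate dynamically instead.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* A single-ported tag array whose one port alternates owners:
 * even cycles serve the processor's own loads and stores,
 * odd cycles serve the snoop port.  Whoever loses a cycle
 * simply stalls and retries on its next slot. */
static bool tag_port_grant(uint64_t cycle, bool is_snoop)
{
    bool snoop_slot = (cycle % 2) == 1;  /* odd cycle: snooper's turn */
    return is_snoop == snoop_slot;
}

int main(void)
{
    for (uint64_t cycle = 0; cycle < 4; cycle++)
        printf("cycle %llu: %s owns the tag port\n",
               (unsigned long long)cycle,
               tag_port_grant(cycle, true) ? "snoop" : "processor");
    return 0;
}
```

The trade-off, as described above, is that the processor loses up to half its tag bandwidth in exchange for not dual-porting the array.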