1 00:00:03,580 --> 00:00:09,203 So this brings us up to what we talked about right at the end of class last time 2 00:00:09,203 --> 00:00:14,965 which was looking at an example where you let's say have a sequentially consistent 3 00:00:14,965 --> 00:00:19,790 processor. And, we take a piece of code, that we had talked about at the beginning 4 00:00:19,790 --> 00:00:24,492 of class last time, and ask ourselves. Does this still work. And what we're gonna 5 00:00:24,492 --> 00:00:29,425 do is instead of having one producer and one consumer, we're gonna have one 6 00:00:29,425 --> 00:00:34,555 producer and two consumers, or multiple consumers. So as shown here, we have one 7 00:00:34,555 --> 00:00:39,751 producer. It has a register where it keeps the tail pointer. There's, there's two 8 00:00:39,751 --> 00:00:44,684 different pointers here in memory. The actual data storage of the queue, two 9 00:00:44,684 --> 00:00:49,287 consumers. They have their own register sets, 'cuz they're two different 10 00:00:49,287 --> 00:00:54,545 processors, or two different complete, threads. So they have registers which they 11 00:00:54,545 --> 00:00:59,111 cache effectively. It's not a cache, but they effectively have to load, load it 12 00:00:59,111 --> 00:01:03,796 into their register space. And then they actually have where they will load the 13 00:01:03,796 --> 00:01:10,003 actual value that they pop off the ends of the queue. So someone I think said this at 14 00:01:10,003 --> 00:01:15,956 the end of class here. But if you're going to be executing the producer, this piece 15 00:01:15,956 --> 00:01:21,764 of producer code, and you're going to execute this consumer code does anyone see 16 00:01:21,764 --> 00:01:27,426 a problem here of something that can happen that we probably don't want for a 17 00:01:27,426 --> 00:01:33,234 first in, first out queue with multiple readers. Yeah so let's, let's walk through 18 00:01:33,234 --> 00:01:39,631 that. Let's say two consumers, our execute, we, we interweave two consumers 19 00:01:39,631 --> 00:01:46,481 executing the same instruction, from one consumer thread and the other consumer 20 00:01:46,481 --> 00:01:53,330 thread every other cycle. So we have, the first consumer will read the head pointer 21 00:01:53,330 --> 00:01:58,288 into the head. The second consumer will read the head pointer into its own head 22 00:01:58,288 --> 00:02:06,144 register. This consumer here will read the tail pointer into, the tail register. The 23 00:02:06,144 --> 00:02:12,525 second consumer will do that same thing. So all of a sudden the two consumers read 24 00:02:12,525 --> 00:02:17,871 the same heads and the same tail. And this is sequentially consistent cuz we've not 25 00:02:17,871 --> 00:02:23,089 done run, done no reordering between within the thread. we've just done some 26 00:02:23,089 --> 00:02:27,946 valid interlead. Now they both check this. This is the, the case which is basically 27 00:02:27,946 --> 00:02:31,454 saying is there anything in the queue. Does the head equal the tail? If the head 28 00:02:31,454 --> 00:02:34,962 equals the tail there's nothing in the queue. If there's a distance between the 29 00:02:34,962 --> 00:02:39,064 head and the tail that means there's some entries in the queue So they both fall 30 00:02:39,064 --> 00:02:43,799 through an they say, well yeah. There's, there's stuff in the queue. The next thing 31 00:02:43,799 --> 00:02:48,114 they do is they go read from the head. This is the value that they are going to 32 00:02:48,114 --> 00:02:55,006 use. Well they both just read the same head pointer. Because we are gonna execute 33 00:02:55,006 --> 00:03:00,220 from one thread and we're going to load that value into another thread. So all of 34 00:03:00,220 --> 00:03:05,628 a sudden we just read the same value twice and one is the one thread and one is the 35 00:03:05,628 --> 00:03:10,777 other thread but it was the same, same value that we get out of reading from the, 36 00:03:10,777 --> 00:03:16,470 the pointer that points to the, the, the head if you will. Now. In a perfect world, 37 00:03:16,470 --> 00:03:21,215 that would be the only bad thing that could happen in this case cuz we would 38 00:03:21,215 --> 00:03:25,774 incorrect, they'd both incorrect the headpointers, then, then they do stores 39 00:03:25,774 --> 00:03:31,196 here and effectively what would happen is they would write the same head location 40 00:03:31,196 --> 00:03:37,408 into the, the, or the same pointer if you will into, into the head. That's not so 41 00:03:37,408 --> 00:03:41,251 bad. But effectively what we did is we enqueued one thing into the queue, and 42 00:03:41,251 --> 00:03:45,399 then we got two things out of the queue. This breaks our semantics of what a queue 43 00:03:45,399 --> 00:03:52,034 should actually do. Something that could happen that's much worse here. Is that one 44 00:03:52,034 --> 00:03:56,878 of the threads, let's say another valid interweaving, is the one thread just 45 00:03:56,878 --> 00:04:04,310 stalls for a long time in here. And, like a billion cycles, it just falls for a 46 00:04:04,310 --> 00:04:09,450 billion cycles. It takes really bad cachmus or it traps into the operating 47 00:04:09,450 --> 00:04:14,657 system, or something bad happens to it. What can happen is before the update at 48 00:04:14,657 --> 00:04:20,632 the headpointer happens, or what could, what could have happened here is you could 49 00:04:20,632 --> 00:04:25,742 have something really strange happen. Or, or, you could basically have the other 50 00:04:25,742 --> 00:04:31,115 thread execute, and it could pop a whole bunch of values off the head of the queue 51 00:04:31,115 --> 00:04:36,291 in the meantime. So let's say there was 100 thi ngs in the queue. We both execute 52 00:04:36,291 --> 00:04:42,062 into here at the same time. One of the threads get stalled here for a long period 53 00:04:42,062 --> 00:04:46,896 of time. The other thread pops 100 things off the queue. But then finally, this one 54 00:04:46,896 --> 00:04:51,133 here goes and updates the head pointer again. So what's gonna happen, is 55 00:04:51,133 --> 00:04:56,027 effectively you are basically going to reset the queue and put 100 things back 56 00:04:56,027 --> 00:05:00,563 onto the queue. And, it's probably going to be garbage in memory, in the queue at 57 00:05:00,563 --> 00:05:09,801 that time. So this brings up the idea that we may not want to be concurrently 58 00:05:09,801 --> 00:05:15,305 executing even the sequentially, even the strict sequentially consistent model. This 59 00:05:15,305 --> 00:05:20,876 block of code and another threads piece of code which is executing the same, same 60 00:05:20,876 --> 00:05:26,049 thing, the, the two consumers. So we're going to call this a critical section. if 61 00:05:26,049 --> 00:05:31,488 you take an operating system's class you will seen this, seen this a bunch. And the 62 00:05:31,488 --> 00:05:36,860 idea here is that you want to atomically execute from all the consumers possible. 63 00:05:36,860 --> 00:05:41,375 This block of code here. So no two consumers could begin to execute that 64 00:05:41,375 --> 00:05:49,817 block of code at the same time. And one way to do that is you have a lock on that 65 00:05:49,817 --> 00:05:57,127 piece of code. So you grab a lock, you can go execute that piece of code, and then 66 00:05:57,127 --> 00:06:07,718 you release the lock. Questions so far? Okay, so a little bit more general notion 67 00:06:07,718 --> 00:06:15,798 of the same idea of a lock is called a semaphore. And if you've taken !!! class 68 00:06:15,798 --> 00:06:23,772 you've probably seen these strange nomenclatures of P and V semaphores. And 69 00:06:24,091 --> 00:06:32,171 if you want to go and look up where it actually comes from, it's Dutch. And P 70 00:06:32,171 --> 00:06:40,173 largely means attempt to acquire the semaphore. But, in, in, in Dutch it means 71 00:06:40,445 --> 00:06:47,235 I don't speak. Does anyone here speak Dutch? By chance? Anyone speak anything 72 00:06:47,235 --> 00:06:54,930 close to Dutch ? You do? Little bit? Okay, how do we, how do we, how do we say that? 73 00:06:54,930 --> 00:07:03,846 What, what is close to Dutch actually? German's close to Dutch? Yeah. Okay. I 74 00:07:03,846 --> 00:07:10,749 guess it's. so I don't, I don't speak Dutch or German, but it means try to 75 00:07:10,749 --> 00:07:17,100 reduce. In a semaphore you can have multiple, people trying to grab a lock at 76 00:07:17,100 --> 00:07:21,930 the same time and multiple people trying to caching to aquire that lock at the same 77 00:07:21,930 --> 00:07:26,645 time. So an example, or aquire for the same time. The example that we gave in the 78 00:07:26,645 --> 00:07:31,532 last class is that you have multiple users trying to use one I/O device, and let's 79 00:07:31,532 --> 00:07:36,788 say that there are two queues in the I/O device. You have 100 different processors 80 00:07:36,788 --> 00:07:41,860 or threads trying to use that one I/O device. You can actually have strictly 81 00:07:41,860 --> 00:07:46,999 only two acquire the I/O device, but not more than two. In a Mutex, you're only 82 00:07:46,999 --> 00:07:52,138 allowed to have one thing be able to acquire the critical section or acquire 83 00:07:52,138 --> 00:07:57,543 the lock at a time. And, hence, we have a little bit more general term for this. A P 84 00:07:57,543 --> 00:08:04,216 semaphore basically, decrements a counter. So we start off with a counter S, here. We 85 00:08:04,216 --> 00:08:11,173 put some number into it, and then we do P on that, it'll actually decrease the 86 00:08:11,173 --> 00:08:17,927 counter, and if the counter was greater than zero, we can go execute our Kregel 87 00:08:17,927 --> 00:08:25,657 section. Then we have the release for the V semaphore. And that will actually 88 00:08:25,657 --> 00:08:32,416 increment this counter by one and it'll wake up some other process if there is 89 00:08:32,416 --> 00:08:39,602 some other process waiting on that. if, if, if, I said dropped below, one it'll go 90 00:08:39,602 --> 00:08:45,200 wake up another process . And, one of the important things here is that these, both 91 00:08:45,200 --> 00:08:49,286 these P and these V semaphores must be atomic, relative to each other. So however 92 00:08:49,286 --> 00:08:53,424 you go, thinking about going to implement them, they must, they must be relative to 93 00:08:53,424 --> 00:08:57,459 each other. And this is, with respect to everything happening, everything else 94 00:08:57,459 --> 00:09:01,239 happening on your system. Otherwise, really, really bad things will start to 95 00:09:01,239 --> 00:09:06,046 happen. You'll actually, have multiple people grabbing the lock. And how you 96 00:09:06,046 --> 00:09:11,698 would typically use this is inside of your thread or your process. You're going to 97 00:09:11,698 --> 00:09:18,396 put your critical section in, put the acquire here, and the release there. And 98 00:09:18,396 --> 00:09:23,866 one of the important things that, here to note is these semaphores. You can have the 99 00:09:23,866 --> 00:09:29,473 value S here determine how many the, the initial value of S rather determines how 100 00:09:29,473 --> 00:09:34,804 many different threads or processes are able to enter the curricular section 101 00:09:34,804 --> 00:09:40,481 concurrently. If S is initially such one that means one only one thread is allowed 102 00:09:40,481 --> 00:09:46,020 to go in there and that's what's called a Mutex. so it's mutually exclusive, only 103 00:09:46,020 --> 00:09:49,966 one thread can go into a curricular section at that time.