1
00:00:03,580 --> 00:00:09,203
So this brings us up to what we talked
about right at the end of class last time

2
00:00:09,203 --> 00:00:14,965
which was looking at an example where you
let's say have a sequentially consistent

3
00:00:14,965 --> 00:00:19,790
processor. And, we take a piece of code,
that we had talked about at the beginning

4
00:00:19,790 --> 00:00:24,492
of class last time, and ask ourselves.
Does this still work. And what we're gonna

5
00:00:24,492 --> 00:00:29,425
do is instead of having one producer and
one consumer, we're gonna have one

6
00:00:29,425 --> 00:00:34,555
producer and two consumers, or multiple
consumers. So as shown here, we have one

7
00:00:34,555 --> 00:00:39,751
producer. It has a register where it keeps
the tail pointer. There's, there's two

8
00:00:39,751 --> 00:00:44,684
different pointers here in memory. The
actual data storage of the queue, two

9
00:00:44,684 --> 00:00:49,287
consumers. They have their own register
sets, 'cuz they're two different

10
00:00:49,287 --> 00:00:54,545
processors, or two different complete,
threads. So they have registers which they

11
00:00:54,545 --> 00:00:59,111
cache effectively. It's not a cache, but
they effectively have to load, load it

12
00:00:59,111 --> 00:01:03,796
into their register space. And then they
actually have where they will load the

13
00:01:03,796 --> 00:01:10,003
actual value that they pop off the ends of
the queue. So someone I think said this at

14
00:01:10,003 --> 00:01:15,956
the end of class here. But if you're going
to be executing the producer, this piece

15
00:01:15,956 --> 00:01:21,764
of producer code, and you're going to
execute this consumer code does anyone see

16
00:01:21,764 --> 00:01:27,426
a problem here of something that can
happen that we probably don't want for a

17
00:01:27,426 --> 00:01:33,234
first in, first out queue with multiple
readers. Yeah so let's, let's walk through

18
00:01:33,234 --> 00:01:39,631
that. Let's say two consumers, our
execute, we, we interweave two consumers

19
00:01:39,631 --> 00:01:46,481
executing the same instruction, from one
consumer thread and the other consumer

20
00:01:46,481 --> 00:01:53,330
thread every other cycle. So we have, the
first consumer will read the head pointer

21
00:01:53,330 --> 00:01:58,288
into the head. The second consumer will
read the head pointer into its own head

22
00:01:58,288 --> 00:02:06,144
register. This consumer here will read the
tail pointer into, the tail register. The

23
00:02:06,144 --> 00:02:12,525
second consumer will do that same thing.
So all of a sudden the two consumers read

24
00:02:12,525 --> 00:02:17,871
the same heads and the same tail. And this
is sequentially consistent cuz we've not

25
00:02:17,871 --> 00:02:23,089
done run, done no reordering between
within the thread. we've just done some

26
00:02:23,089 --> 00:02:27,946
valid interlead. Now they both check this.
This is the, the case which is basically

27
00:02:27,946 --> 00:02:31,454
saying is there anything in the queue.
Does the head equal the tail? If the head

28
00:02:31,454 --> 00:02:34,962
equals the tail there's nothing in the
queue. If there's a distance between the

29
00:02:34,962 --> 00:02:39,064
head and the tail that means there's some
entries in the queue So they both fall

30
00:02:39,064 --> 00:02:43,799
through an they say, well yeah. There's,
there's stuff in the queue. The next thing

31
00:02:43,799 --> 00:02:48,114
they do is they go read from the head.
This is the value that they are going to

32
00:02:48,114 --> 00:02:55,006
use. Well they both just read the same
head pointer. Because we are gonna execute

33
00:02:55,006 --> 00:03:00,220
from one thread and we're going to load
that value into another thread. So all of

34
00:03:00,220 --> 00:03:05,628
a sudden we just read the same value twice
and one is the one thread and one is the

35
00:03:05,628 --> 00:03:10,777
other thread but it was the same, same
value that we get out of reading from the,

36
00:03:10,777 --> 00:03:16,470
the pointer that points to the, the, the
head if you will. Now. In a perfect world,

37
00:03:16,470 --> 00:03:21,215
that would be the only bad thing that
could happen in this case cuz we would

38
00:03:21,215 --> 00:03:25,774
incorrect, they'd both incorrect the
headpointers, then, then they do stores

39
00:03:25,774 --> 00:03:31,196
here and effectively what would happen is
they would write the same head location

40
00:03:31,196 --> 00:03:37,408
into the, the, or the same pointer if you
will into, into the head. That's not so

41
00:03:37,408 --> 00:03:41,251
bad. But effectively what we did is we
enqueued one thing into the queue, and

42
00:03:41,251 --> 00:03:45,399
then we got two things out of the queue.
This breaks our semantics of what a queue

43
00:03:45,399 --> 00:03:52,034
should actually do. Something that could
happen that's much worse here. Is that one

44
00:03:52,034 --> 00:03:56,878
of the threads, let's say another valid
interweaving, is the one thread just

45
00:03:56,878 --> 00:04:04,310
stalls for a long time in here. And, like
a billion cycles, it just falls for a

46
00:04:04,310 --> 00:04:09,450
billion cycles. It takes really bad
cachmus or it traps into the operating

47
00:04:09,450 --> 00:04:14,657
system, or something bad happens to it.
What can happen is before the update at

48
00:04:14,657 --> 00:04:20,632
the headpointer happens, or what could,
what could have happened here is you could

49
00:04:20,632 --> 00:04:25,742
have something really strange happen. Or,
or, you could basically have the other

50
00:04:25,742 --> 00:04:31,115
thread execute, and it could pop a whole
bunch of values off the head of the queue

51
00:04:31,115 --> 00:04:36,291
in the meantime. So let's say there was
100 thi ngs in the queue. We both execute

52
00:04:36,291 --> 00:04:42,062
into here at the same time. One of the
threads get stalled here for a long period

53
00:04:42,062 --> 00:04:46,896
of time. The other thread pops 100 things
off the queue. But then finally, this one

54
00:04:46,896 --> 00:04:51,133
here goes and updates the head pointer
again. So what's gonna happen, is

55
00:04:51,133 --> 00:04:56,027
effectively you are basically going to
reset the queue and put 100 things back

56
00:04:56,027 --> 00:05:00,563
onto the queue. And, it's probably going
to be garbage in memory, in the queue at

57
00:05:00,563 --> 00:05:09,801
that time. So this brings up the idea that
we may not want to be concurrently

58
00:05:09,801 --> 00:05:15,305
executing even the sequentially, even the
strict sequentially consistent model. This

59
00:05:15,305 --> 00:05:20,876
block of code and another threads piece of
code which is executing the same, same

60
00:05:20,876 --> 00:05:26,049
thing, the, the two consumers. So we're
going to call this a critical section. if

61
00:05:26,049 --> 00:05:31,488
you take an operating system's class you
will seen this, seen this a bunch. And the

62
00:05:31,488 --> 00:05:36,860
idea here is that you want to atomically
execute from all the consumers possible.

63
00:05:36,860 --> 00:05:41,375
This block of code here. So no two
consumers could begin to execute that

64
00:05:41,375 --> 00:05:49,817
block of code at the same time. And one
way to do that is you have a lock on that

65
00:05:49,817 --> 00:05:57,127
piece of code. So you grab a lock, you can
go execute that piece of code, and then

66
00:05:57,127 --> 00:06:07,718
you release the lock. Questions so far?
Okay, so a little bit more general notion

67
00:06:07,718 --> 00:06:15,798
of the same idea of a lock is called a
semaphore. And if you've taken !!! class

68
00:06:15,798 --> 00:06:23,772
you've probably seen these strange
nomenclatures of P and V semaphores. And

69
00:06:24,091 --> 00:06:32,171
if you want to go and look up where it
actually comes from, it's Dutch. And P

70
00:06:32,171 --> 00:06:40,173
largely means attempt to acquire the
semaphore. But, in, in, in Dutch it means

71
00:06:40,445 --> 00:06:47,235
I don't speak. Does anyone here speak
Dutch? By chance? Anyone speak anything

72
00:06:47,235 --> 00:06:54,930
close to Dutch ? You do? Little bit? Okay,
how do we, how do we, how do we say that?

73
00:06:54,930 --> 00:07:03,846
What, what is close to Dutch actually?
German's close to Dutch? Yeah. Okay. I

74
00:07:03,846 --> 00:07:10,749
guess it's. so I don't, I don't speak
Dutch or German, but it means try to

75
00:07:10,749 --> 00:07:17,100
reduce. In a semaphore you can have
multiple, people trying to grab a lock at

76
00:07:17,100 --> 00:07:21,930
the same time and multiple people trying
to caching to aquire that lock at the same

77
00:07:21,930 --> 00:07:26,645
time. So an example, or aquire for the
same time. The example that we gave in the

78
00:07:26,645 --> 00:07:31,532
last class is that you have multiple users
trying to use one I/O device, and let's

79
00:07:31,532 --> 00:07:36,788
say that there are two queues in the I/O
device. You have 100 different processors

80
00:07:36,788 --> 00:07:41,860
or threads trying to use that one I/O
device. You can actually have strictly

81
00:07:41,860 --> 00:07:46,999
only two acquire the I/O device, but not
more than two. In a Mutex, you're only

82
00:07:46,999 --> 00:07:52,138
allowed to have one thing be able to
acquire the critical section or acquire

83
00:07:52,138 --> 00:07:57,543
the lock at a time. And, hence, we have a
little bit more general term for this. A P

84
00:07:57,543 --> 00:08:04,216
semaphore basically, decrements a counter.
So we start off with a counter S, here. We

85
00:08:04,216 --> 00:08:11,173
put some number into it, and then we do P
on that, it'll actually decrease the

86
00:08:11,173 --> 00:08:17,927
counter, and if the counter was greater
than zero, we can go execute our Kregel

87
00:08:17,927 --> 00:08:25,657
section. Then we have the release for the
V semaphore. And that will actually

88
00:08:25,657 --> 00:08:32,416
increment this counter by one and it'll
wake up some other process if there is

89
00:08:32,416 --> 00:08:39,602
some other process waiting on that. if,
if, if, I said dropped below, one it'll go

90
00:08:39,602 --> 00:08:45,200
wake up another process . And, one of the
important things here is that these, both

91
00:08:45,200 --> 00:08:49,286
these P and these V semaphores must be
atomic, relative to each other. So however

92
00:08:49,286 --> 00:08:53,424
you go, thinking about going to implement
them, they must, they must be relative to

93
00:08:53,424 --> 00:08:57,459
each other. And this is, with respect to
everything happening, everything else

94
00:08:57,459 --> 00:09:01,239
happening on your system. Otherwise,
really, really bad things will start to

95
00:09:01,239 --> 00:09:06,046
happen. You'll actually, have multiple
people grabbing the lock. And how you

96
00:09:06,046 --> 00:09:11,698
would typically use this is inside of your
thread or your process. You're going to

97
00:09:11,698 --> 00:09:18,396
put your critical section in, put the
acquire here, and the release there. And

98
00:09:18,396 --> 00:09:23,866
one of the important things that, here to
note is these semaphores. You can have the

99
00:09:23,866 --> 00:09:29,473
value S here determine how many the, the
initial value of S rather determines how

100
00:09:29,473 --> 00:09:34,804
many different threads or processes are
able to enter the curricular section

101
00:09:34,804 --> 00:09:40,481
concurrently. If S is initially such one
that means one only one thread is allowed

102
00:09:40,481 --> 00:09:46,020
to go in there and that's what's called a
Mutex. so it's mutually exclusive, only

103
00:09:46,020 --> 00:09:49,966
one thread can go into a curricular
section at that time.