WEBVTT

1
00:00:49.890 --> 00:00:50.700
Jeffrey Osier-Mixon: Okay, we are recording

2
00:00:51.000 --> 00:00:51.420
Okay.

3
00:00:52.560 --> 00:01:03.720
Ted Marena: Well welcome everybody to the risk five Bay Area Meetup. I would expect some of us may not be in the Bay Area, because of the virtual nature of the event.

4
00:01:04.380 --> 00:01:20.790
Ted Marena: This is a presentation on cash coherent memory fabric which is based on influence on risk five. My name is Ted Miranda and I'll be just doing a really quick introductory. Let me just share my screen. I just have a couple of slides.

5
00:01:22.140 --> 00:01:26.370
Ted Marena: So let me see if I can do that over here. So,

6
00:01:27.690 --> 00:01:34.530
Ted Marena: For those of you who are, you should be able to see my screen it says risk five with these bubbles. Yes.

7
00:01:35.040 --> 00:01:36.000
Ted Marena: Okay, great. So

8
00:01:37.020 --> 00:01:45.930
Ted Marena: Risk five. For those of you who are not aware. I think a lot of you are familiar with what it is. It's basically an instruction set architecture. It sets the specifications.

9
00:01:46.170 --> 00:01:58.650
Ted Marena: For what you can do to a processor core, it allows you to, you know, really have full control and the real key here is the openness of the specification

10
00:01:59.130 --> 00:02:13.410
Ted Marena: And the other part of the openness are of the specified specification. Is that you, you know, it just basically sets. What you need to do. It doesn't actually talk about the implementation or how to do it. So,

11
00:02:14.610 --> 00:02:28.980
Ted Marena: You know, really, that's where the group chips alliance where we're going to be talking about tile link tile link is actually a boss and interface based off of risk five and

12
00:02:30.210 --> 00:02:38.520
Ted Marena: We're also going to be talking about how Ty link has been extended and that's what yields on the extend and

13
00:02:41.610 --> 00:02:58.140
Ted Marena: The Intel Tofino switch is what we're using to do demonstrations and proof of concept for this capability. So what's really neat is that this is a an architecture that's really been influenced and enabled by risk five

14
00:02:58.590 --> 00:03:05.730
Ted Marena: So the group that all this works being done is called chips alliance. You can see number of organizations that are part of the group.

15
00:03:06.510 --> 00:03:11.370
Ted Marena: The organization develops open source hardware as well as software development tools.

16
00:03:11.910 --> 00:03:21.180
Ted Marena: And there's a number of work groups that are part of this. And so what we're going to be focusing on is really the interconnect workgroup

17
00:03:21.540 --> 00:03:34.650
Ted Marena: And two particular applications or developments within that is tile link as well as on the extend. So there's also a triple interface called AIB that's part of our interconnect.

18
00:03:35.460 --> 00:03:47.910
Ted Marena: That's also an Intel supported and sponsored project. So lots of different activities. You can see we have course work group and we have software tools work group and so on.

19
00:03:48.660 --> 00:03:54.060
Ted Marena: So chips alliance. If you want to know more, you can go to the chip science.org website.

20
00:03:54.990 --> 00:04:04.650
Ted Marena: The whole idea of the group is it's an open no barrier collaboration allows companies to share development resources lowers the cost of development.

21
00:04:05.100 --> 00:04:12.990
Ted Marena: And it can also provide a red hat model you can introduce something that's open and then monetize it put support services around it.

22
00:04:13.800 --> 00:04:32.490
Ted Marena: So that's the end of my presentation, I'm going to stop my share. I do want to let everybody know if you have questions, there is a Q AMP a box that I'll keep an eye on and monitor and then you can also, I believe there's like a place for you to raise your hand so

23
00:04:33.540 --> 00:04:37.890
Ted Marena: Jeff roe, I think, is there anything else that I should say before we hand the baton over this van Amir

24
00:04:39.000 --> 00:04:39.990
Jeffrey Osier-Mixon: No, I don't think so.

25
00:04:40.350 --> 00:04:41.850
Zvonimir Bandic: I think vessel is being first though.

26
00:04:42.840 --> 00:04:46.710
Ted Marena: Oh, sorry. Yes. Wesley. I'm sorry. Yeah. So Wesley, why don't you

27
00:04:47.850 --> 00:04:48.540
Ted Marena: Take over

28
00:04:49.800 --> 00:04:51.660
Ted Marena: And you should be able to do this.

29
00:04:57.510 --> 00:04:59.820
Jeffrey Osier-Mixon: Looks like Ted is frozen, but you can go right ahead.

30
00:05:05.220 --> 00:05:05.790
Jeffrey Osier-Mixon: Frozen now.

31
00:05:06.690 --> 00:05:07.410
Ted Marena: Are you there.

32
00:05:12.330 --> 00:05:13.530
Ted Marena: Yes, you're on mute.

33
00:05:18.180 --> 00:05:19.740
Jeffrey Osier-Mixon: He says he cannot unmute

34
00:05:20.130 --> 00:05:22.830
Jeffrey Osier-Mixon: Oh give me one second to fix that.

35
00:05:24.330 --> 00:05:27.390
Ted Marena: Okay, that would be difficult if he

36
00:05:27.420 --> 00:05:28.800
Ted Marena: Cannot tell you so.

37
00:05:28.950 --> 00:05:29.730
Jeffrey Osier-Mixon: Now, we can hear you.

38
00:05:29.940 --> 00:05:30.240
Okay.

39
00:05:32.460 --> 00:05:37.140
Wesley Terpstra: I don't know. I talked earlier, it worked. But again, anyway, let me just share my screen.

40
00:05:40.650 --> 00:05:45.030
Wesley Terpstra: And I hear you go, you can see me on the beach as well. Okay.

41
00:05:46.500 --> 00:05:53.910
Wesley Terpstra: So hi, I'm wessling I'm from see five and I was asked to give you guys a brief introduction to what talent is

42
00:05:55.500 --> 00:06:04.800
Wesley Terpstra: We use talent internally at sci fi for most of our shipping or connect and what we don't have to interface with other companies, obviously.

43
00:06:05.940 --> 00:06:21.630
Wesley Terpstra: But I'm going to give a quick overview how what it is, what it does, how it works, and what the future holds. So okay, so what is tiling. It's a pretty easy one. It's just a protocol for connecting masters like course was like like memory so

44
00:06:25.020 --> 00:06:30.870
Wesley Terpstra: That's it in a nutshell. Um, there are a lot of different bosses like that. One of the things that makes tying someone special is

45
00:06:31.470 --> 00:06:36.120
Wesley Terpstra: We've gone out of our way to make the protocol have a very simple message vocabulary.

46
00:06:36.540 --> 00:06:52.590
Wesley Terpstra: That works across different levels of complexity. So you can use the same protocol in very simple slave devices, but you can also use it in cash coherent situations where you need to transfer both the data and the ownership of blocks back and forth.

47
00:06:55.620 --> 00:07:05.190
Wesley Terpstra: So just to like walk you through sort of like the basics of this, the idea is that you have agents which are you know components in your SSE they need to talk to each other.

48
00:07:05.760 --> 00:07:12.510
Wesley Terpstra: And they can either act as a master Westlife because the talent protocol. It's a master slave protocol it on the front connects two agents together.

49
00:07:13.260 --> 00:07:20.730
Wesley Terpstra: And you move data across the link that connects them. So an example would be the master might say, I want to do a right, that's a quick message.

50
00:07:21.240 --> 00:07:29.580
Wesley Terpstra: And then the slave upon receipt of that message will respond with a access acknowledgement saying whether or not the second the the right succeeded.

51
00:07:30.270 --> 00:07:36.420
Wesley Terpstra: And of course you can use acknowledgments to do ordering of your operations and so on. But that's sort of in the weeds. We're not going to talk too much about

52
00:07:37.530 --> 00:07:44.520
Wesley Terpstra: You can also, of course, who gets access data. So at the simplest level of conformance you only have to support these two operations. That's it.

53
00:07:44.880 --> 00:08:01.320
Wesley Terpstra: So if you're writing like a spy controller or something. It just has to take the reason send up the data. Take the rights and write the data that much more the that's the URL here. So to obviously for tiling you L is for

54
00:08:03.000 --> 00:08:09.180
Wesley Terpstra: Man, I forgot I think oh yeah us for uncashed and Alice for lightweight. So this is the simplest version uncashed lightweight

55
00:08:09.570 --> 00:08:23.370
Wesley Terpstra: And then you have the uncashed heavyweight version, which adds atomic operation hints and a few other things that is important. So let's just take a look at what it looks like when you're talking about just the simplest level of talent, where you have uncashed

56
00:08:24.570 --> 00:08:29.370
Wesley Terpstra: Transactions, so get some points. So in this picture. You've got your agents visa the bubbles.

57
00:08:29.940 --> 00:08:36.150
Wesley Terpstra: And they're all connected by edges. Those are the tiling gauges. Notice in picture we always point from the master to the slave

58
00:08:36.540 --> 00:08:43.050
Wesley Terpstra: So tiling defines what goes on this wire well on this this this interconnect between the mastering the slave

59
00:08:43.530 --> 00:08:53.250
Wesley Terpstra: To make it just really clear. I've just label all of the edges. You can see every agent here acts as a master or slave on every edge a single master could be

60
00:08:53.760 --> 00:08:58.560
Wesley Terpstra: Sorry, a single agent could be both a slave on one time, Link. Link, and a mastering different time like link.

61
00:08:59.130 --> 00:09:10.860
Wesley Terpstra: Or in the case of like this router here. It can be a slave on two links in the mastermind two links so agents are connected multiple to multiple tiling links, potentially, and they can take on either the master or slave role or above.

62
00:09:12.180 --> 00:09:23.790
Wesley Terpstra: The talent protocol. Again, it's only talking about what happens on these edges that connect the agents. We don't tell you what you have to do inside an agent. That's the micro architecture that's up to the designer so

63
00:09:25.560 --> 00:09:31.920
Wesley Terpstra: Here's just an example, say the processor wants to do a read it can issue a get message to its immediate neighbor.

64
00:09:32.400 --> 00:09:42.840
Wesley Terpstra: So it says from the master the cores. The master ISSUES TO GET request to the slave and then the slave is force under obligation to answer that guy, but he might, you know, send it through the network.

65
00:09:43.890 --> 00:09:52.980
Wesley Terpstra: Specially to get messages sent over each of these talent links till eventually gets the memory. And then of course the memory can't respond with the data and

66
00:09:53.460 --> 00:10:01.620
Wesley Terpstra: The answers come back. So what we've seen here is we generally drawing the picture is master slave points for master slave, but sometimes the message is going the opposite direction.

67
00:10:02.730 --> 00:10:07.350
Wesley Terpstra: Okay, so I mean that's pretty straightforward. I think everyone's familiar with bosses like that.

68
00:10:08.730 --> 00:10:16.260
Wesley Terpstra: There is one thing, though, that you might want to improve. So get some quotes essentially the ownership of the data. The slave. What do I mean by that, I mean,

69
00:10:17.010 --> 00:10:25.380
Wesley Terpstra: So it's very simple that like if you're a master and you send a request out firstly, you're only sending 211 person that your slave connection, obviously.

70
00:10:25.680 --> 00:10:29.700
Wesley Terpstra: But if you're like an interconnect like that river in the picture here, and you have to decide.

71
00:10:30.180 --> 00:10:40.080
Wesley Terpstra: Message came in, where do I send it. It's really simple. You just know where to send it because it's whoever owns that address that's where it goes. So it's very easy to like route out when you're only having get some points.

72
00:10:40.650 --> 00:10:46.230
Wesley Terpstra: And also it's very easy to resolve any kind of ordering or race conditions because if you're a slave. That's memory.

73
00:10:46.560 --> 00:10:51.150
Wesley Terpstra: The requests come in and when they arrive at you, the memory you decide the order of the operations.

74
00:10:51.390 --> 00:11:05.100
Wesley Terpstra: So this is a really simple model work with and it's pretty decent. If you're doing like high throughput latency insensitive kinds of operations like the MA. So like if you're streaming a lot of right data to or from like a disc or the network. This is a pretty good model.

75
00:11:06.240 --> 00:11:17.070
Wesley Terpstra: But obviously, I wouldn't be talking if there wasn't something here to better. The question is, what if the data is far away. Perhaps you know across the chip or even over the network and you want to use it more than one time.

76
00:11:18.750 --> 00:11:33.150
Wesley Terpstra: And that so also that if you get if you're in that situation. Those repeated accesses my waist traffic and weeks and weeks power, but particularly for processors inside sci fi. Of course we we sell processors. So we care a lot about this scenario.

77
00:11:34.470 --> 00:11:44.010
Wesley Terpstra: Process or processes are quite sensitive to latency. So if you do like a load on a processor and a lot of instructions depend on it. You want to get your answer quite quickly. So

78
00:11:44.520 --> 00:11:51.990
Wesley Terpstra: That really hurts performance that you can't answer the query without crossing the chip. So that's where we come to transfer ownership.

79
00:11:52.560 --> 00:12:00.090
Wesley Terpstra: So talent lets you move where the block is on. So just a quick terminology thing a block has just a chunk of memory.

80
00:12:00.630 --> 00:12:10.770
Wesley Terpstra: Typically 64 bytes or no hundred 28 are in this range, right, just a chunk of memory and that check a memory can move throughout the system like who is responsible. So, and

81
00:12:11.010 --> 00:12:16.770
Wesley Terpstra: Again, in the simplest scenario we're talking about initially. It's always the slave owns the catch block.

82
00:12:17.040 --> 00:12:25.650
Wesley Terpstra: So if the master wants to perform operations as a service request to the slave the slave does whatever the thing was like get her report and it gives the answer back

83
00:12:26.130 --> 00:12:34.860
Wesley Terpstra: So that's in that scenario, the only person with a copy of the the block is the slave. That's the unique case right only one agent has it. And it's this life.

84
00:12:35.700 --> 00:12:42.780
Wesley Terpstra: You can also when you have data movement. And sorry, an ownership movement. You can move that relationship and a cash so

85
00:12:43.050 --> 00:12:53.130
Wesley Terpstra: The master can become the person who has the only unique copy and then he's able to do in his reads and writes without having to go across the link. So he's taken the block from the slave and he owns that block.

86
00:12:54.390 --> 00:12:59.130
Wesley Terpstra: And if a slave wants to do an operation. It's not allowed to anymore has to ask for the plug back from the master.

87
00:12:59.700 --> 00:13:09.570
Wesley Terpstra: And finally, there's this case where you can have it shared. So both the slave and the master have a copy, and they can both there for service reads, but only. Well, neither of them can

88
00:13:09.990 --> 00:13:17.160
Wesley Terpstra: Perform right until a change in ownership happens that's that one of the sides is unique. So that's the way tiling ownership operates.

89
00:13:17.670 --> 00:13:22.740
Ted Marena: Leslie, just a hold on a second. I just wanted to see if you might be able to answer a question that came in on the chat.

90
00:13:22.950 --> 00:13:24.600
Wesley Terpstra: Yeah, I was trying to figure how to work that and

91
00:13:24.600 --> 00:13:28.110
Ted Marena: Okay, yeah, I mean, do you want to just answer it live really quick. All right.

92
00:13:28.620 --> 00:13:33.660
Ted Marena: Go ahead. The question is are there any restrictions and how quickly the sweat slave needs to respond to a request.

93
00:13:33.840 --> 00:13:48.900
Wesley Terpstra: Now the talent protocol explicitly forbid you from in fact building timers into the protocol, because the we're going to come to this later, but the talent protocol if you build your system conforming to the requirements of the specification. The system cannot deadlock.

94
00:13:49.260 --> 00:13:50.640
Wesley Terpstra: So if you have timeouts.

95
00:13:50.880 --> 00:13:58.080
Wesley Terpstra: In a system for slow responding slave, you can win situation for you like answer a message twice like

96
00:13:58.560 --> 00:13:59.760
Wesley Terpstra: send the request to slave

97
00:14:00.300 --> 00:14:11.190
Wesley Terpstra: The interconnect the side the slave is down and responds with, like, an air and then later the slave response to the actual answer. Those are really difficult problems to deal with in a real system. So actually the timing protocols, just like

98
00:14:11.640 --> 00:14:16.650
Wesley Terpstra: Flat out, you're not allowed to put timers and unless you're interfacing with the devices, not to be buggy, and if you do

99
00:14:16.740 --> 00:14:19.650
Wesley Terpstra: The timer cut that device off completely. So

100
00:14:19.920 --> 00:14:24.540
Wesley Terpstra: There is no baked into the protocol time limit. In fact, the opposite.

101
00:14:25.530 --> 00:14:33.720
Wesley Terpstra: But in practice, obviously, you know, the, the time it takes respond. It's going to depend on the memory, you're talking to, if it's something on ships and he'll probably hundred nanoseconds DDR something

102
00:14:34.200 --> 00:14:38.460
Wesley Terpstra: You know, cash if you're not going to cash, you know, less than. And then the second you're going out to

103
00:14:39.420 --> 00:14:41.490
Ted Marena: Leslie just sorry. Keep going. We just wanted

104
00:14:41.490 --> 00:14:42.660
Ted Marena: To get that quick answer.

105
00:14:42.750 --> 00:14:44.370
Wesley Terpstra: Part three. I got my side.

106
00:14:45.630 --> 00:14:47.490
Wesley Terpstra: I moving on. So,

107
00:14:47.520 --> 00:14:52.740
Wesley Terpstra: So how can ownership, the exchange. So we've seen in this picture that the ownership can move from the slaves. The master or be shared.

108
00:14:53.160 --> 00:14:59.520
Wesley Terpstra: So to do that we need to have messages that can do it. So the, the operations can be initiated by Master or slave

109
00:15:00.000 --> 00:15:08.310
Wesley Terpstra: And the operation might be causing the master to increase his ownership or the slave increases ownership. So for this sort of quadrant four things. There's

110
00:15:08.700 --> 00:15:22.230
Wesley Terpstra: Obviously for possible messages. So the choir message is a message and master sends to obtain the block. So in this picture. That would be like you started with the block owned by the slave master issues and acquire and then end up with the master owning the block.

111
00:15:23.400 --> 00:15:32.190
Wesley Terpstra: Similarly, you could have a master initiated release. This is really common and caches where you want to pull something new in but you don't have space, you have to get rid of something I already have

112
00:15:32.580 --> 00:15:38.850
Wesley Terpstra: So that's where you like I have this block. I don't want any more here slave, you can have it back again. That's a release message.

113
00:15:39.720 --> 00:15:47.760
Wesley Terpstra: Another really important one is probes. This is where you have a say in interconnectivity. For example, this router here say

114
00:15:48.120 --> 00:15:55.530
Wesley Terpstra: At one core takes the block and the other core wants to perform and access. So he asked the router for it. The rotor hospital to get it back. So that's the probe.

115
00:15:55.950 --> 00:15:59.940
Wesley Terpstra: For that you can send a request back to a master and be like, Hey, give me that block back

116
00:16:00.930 --> 00:16:10.050
Wesley Terpstra: So the slave can recover its ownership. There is actually a fourth potential thing that very few protocols implemented tiling also doesn't currently not this, although there has been talk of potentially having it.

117
00:16:10.380 --> 00:16:18.300
Wesley Terpstra: This is the so called stash. This would be were a slave decide. So the master should actually have a block. An example might be like a really starts

118
00:16:18.780 --> 00:16:27.420
Wesley Terpstra: A really smart storage controller that has seen many access patterns and you knows that if you read this blog and this block. You're going to want that block next sort of like a reverse Prefecture.

119
00:16:27.750 --> 00:16:38.970
Wesley Terpstra: And so when the season XCOR asking and say, like here have this block. But regardless, these, these messages here to the choir releasing prob supporting for those, that's what brings you to the third conformance level of talent talent cached.

120
00:16:40.260 --> 00:16:48.930
Wesley Terpstra: Okay, so how do you actually build a boss like this and you know a bunch of things you need to know the types of messages that are sent we talked about all the important messages in Thailand now.

121
00:16:49.500 --> 00:16:54.810
Wesley Terpstra: I'm using some of the field definitions, things like you know the size of the block being transferred was scented

122
00:16:55.530 --> 00:17:00.840
Wesley Terpstra: What's the address so on. You can read the spec for that sort of stuff really important aspect of any

123
00:17:01.320 --> 00:17:07.380
Wesley Terpstra: Bus protocol and particular cash coherent bus protocols is order and what things happen in what order.

124
00:17:07.770 --> 00:17:21.390
Wesley Terpstra: Like what messages may wait on other messaging what messages must wait on other messages. So in Thailand that's that's dealt with by like we have five priority levels that we defined in the spec and every message has a specified priority and

125
00:17:22.530 --> 00:17:24.390
Wesley Terpstra: You are not allowed to wait on

126
00:17:25.560 --> 00:17:35.640
Wesley Terpstra: lower priority message to serve as a higher priority one in a nutshell. And if you follow those rules then like I said earlier, actually you can prove that a composed system of

127
00:17:36.180 --> 00:17:50.040
Wesley Terpstra: correctly implemented tiling blocks cannot deadlock, there's one last thing you need to define, of course, which is how do you code the message on the wire. The nice thing with talent is you consider lots of lots of different ways. And in fact we have done so. So this is the

128
00:17:50.070 --> 00:17:51.270
Ted Marena: Nicely. Hold on a quick second

129
00:17:51.810 --> 00:17:54.900
Ted Marena: Can you address the question that came in is, it

130
00:17:55.020 --> 00:17:55.830
Wesley Terpstra: I don't see them.

131
00:17:56.100 --> 00:17:57.960
Ted Marena: Okay, yeah. So I'll ask it.

132
00:17:58.470 --> 00:18:08.790
Ted Marena: If they're if there's multiple masters across the network. What tile link. What will tiling say about memory coherence if masters don't have the same latency.

133
00:18:10.080 --> 00:18:16.140
Ted Marena: I think that's the question. And that may not be for tiling maybe that is an omni extend question.

134
00:18:16.200 --> 00:18:22.710
Wesley Terpstra: Oh, well, it's just a question for both, I think, since I'm next. In this talent network, but

135
00:18:24.660 --> 00:18:26.580
Wesley Terpstra: The answer okay so

136
00:18:28.200 --> 00:18:39.300
Wesley Terpstra: So if you have mastered at different latency. I mean, that's sort of like a new system. Right. So, for example, my desk right now I've got four VCU one on FPGA is have on extended if one of the FPGA.

137
00:18:39.870 --> 00:18:45.540
Wesley Terpstra: Has a cash block and a different one wants it. Well, then there is a high latency penalty for get that block back

138
00:18:46.020 --> 00:19:02.130
Wesley Terpstra: So accessing so bouncing a cash block between distant masters does take longer opposite than bouncing catch block to the masters that are on the same ship. So if you have to move a cache block a long distance, it will take longer. I mean, yeah.

139
00:19:04.080 --> 00:19:04.560
Ted Marena: Thanks.

140
00:19:04.980 --> 00:19:16.500
Wesley Terpstra: So the this picture here is a picture of the contract protocol we use in this 1.9 spec or 1.8 spec. I can actually see mounts like at 1.8 spec.

141
00:19:17.430 --> 00:19:25.260
Wesley Terpstra: So in this spec. There's this is showing like an example get message. I'm not going to go into the details. If you care about this serialization, you can read the spec.

142
00:19:26.250 --> 00:19:37.680
Wesley Terpstra: Somewhat relevant as you see the the priority here on the left, that's letter A and D. So when you're doing just straight normal tiling uncashed. So tell us, or to you. Ah,

143
00:19:38.790 --> 00:19:47.100
Wesley Terpstra: You only need two priorities. That's the AMD priority. So here we see like a get messages went out the Access Act data comes back and other gamma Cisco access data comes back.

144
00:19:48.600 --> 00:19:57.510
Wesley Terpstra: Right. So there's different ways you can see realize, Thailand, you can do it on the parallel with the the 1.8 spec specifies a parallel ready valid bus so that's

145
00:19:58.110 --> 00:20:12.750
Wesley Terpstra: That's this one here, this is the one we mostly use it internally at side by one chip. Some people have done things knock like where they take on ship the tiling messages, turn them into packets then send them over the knock

146
00:20:14.250 --> 00:20:15.510
Wesley Terpstra: The idea here is just

147
00:20:18.270 --> 00:20:26.490
Wesley Terpstra: Tiling as a lot of wires when it's parallel to keep god that's on this picture. But if you have like an address boss, you know, it's potentially like 40 or more bits wide and the date of us can be wide.

148
00:20:26.670 --> 00:20:32.880
Wesley Terpstra: So if you want to save on wires. Like you can turn it into a knock where it's sort of like a mini serialized protocol on chip.

149
00:20:33.450 --> 00:20:42.570
Wesley Terpstra: You can also do it off chip. So we'd like a packet based parallel credit US version that we call chip link that was in a product we shipped. I'm not sure how many years ago now for quite a while ago.

150
00:20:42.990 --> 00:20:48.120
Wesley Terpstra: About its proprietary and you could you could also do it over the network. So I refer you to the next talk about that.

151
00:20:49.680 --> 00:20:56.820
Wesley Terpstra: Here's a just a picture of like real chip and we've taped out. Not sure how long ago, quite a while, at this point.

152
00:20:58.140 --> 00:21:06.480
Wesley Terpstra: You can see that Scott like four cores here. Bunch of tiling switch. There's a cash er controller, a bunch of slaves.

153
00:21:07.170 --> 00:21:16.020
Wesley Terpstra: This is chilling thing where you can go over to another chip to prototype like IP you'd like to test your risk flags or sci fi chip. This is one of our chips. This is sci fi

154
00:21:17.340 --> 00:21:19.800
Wesley Terpstra: Freedom unleashed 514

155
00:21:21.480 --> 00:21:25.260
Wesley Terpstra: So you can see, though, that the whole thing is glued together by tiling gauges. So you've got

156
00:21:25.980 --> 00:21:41.670
Wesley Terpstra: Between the course. The course, he's a Linux capable of course these when they talk to the to the speaking the coherent version tiling on these edges here and the core is like a management court doesn't have caches all the cash doesn't really count that.

157
00:21:42.690 --> 00:21:47.520
Wesley Terpstra: You see I caches and chemical here. I didn't rest five usually. So this is just the uncashed protocol.

158
00:21:48.720 --> 00:21:52.470
Wesley Terpstra: When you get down the backside. The hell to you're just speaking the uncashed version again.

159
00:21:54.330 --> 00:22:00.360
Wesley Terpstra: Let me get at the chip link here. So the chorus can actually go and they can cache memory from another chip. So if this had

160
00:22:00.630 --> 00:22:07.680
Wesley Terpstra: Say a DDR controller here, you can actually pull that block all the way over here all the way to the other one caching. Whoops, have a core so

161
00:22:08.520 --> 00:22:20.640
Wesley Terpstra: Again, that's sort of the idea also on the extent right years you're going to take remote memory and you can use it in your own local cache. So even though normally to go over here on your first axis is like maybe two microseconds.

162
00:22:20.910 --> 00:22:25.320
Wesley Terpstra: When you're using a normal here, you're back into the, you know, low nanoseconds access time

163
00:22:26.430 --> 00:22:31.740
Wesley Terpstra: But yeah, that's it. All those going all these peripherals here and all speaking the TL UL protocol, because

164
00:22:32.910 --> 00:22:37.560
Wesley Terpstra: Why would you do more of these are easy peripherals just speak the simplest protocol. So in summary,

165
00:22:41.460 --> 00:22:46.080
Wesley Terpstra: I like to find a single message vocabulary. I didn't mention this earlier.

166
00:22:47.100 --> 00:22:49.140
Wesley Terpstra: I meant to but I got derailed the

167
00:22:50.160 --> 00:22:58.020
Wesley Terpstra: This is really helpful for interoperability, because you can just take like these different components that we've got with the different conformance levels of talent and just plug them together.

168
00:22:58.230 --> 00:23:12.150
Wesley Terpstra: And it works. And if any of you have worked with like amber protocols, but things like xi HP a chai. They all have different semantics like the messages are different. The protocol fields are different, the ordering rules are different. And so you end up with this really expensive.

169
00:23:13.560 --> 00:23:18.420
Wesley Terpstra: Conversion wherever you want to go from one protocol to another. So the nice thing about using Thailand is

170
00:23:19.140 --> 00:23:31.440
Wesley Terpstra: So one protocol. It just has sort of levels of conformance when you go between those levels, the semantics don't change. So the order semantics of the same the rules for one message responded with what messages don't change.

171
00:23:33.330 --> 00:23:38.310
Wesley Terpstra: You only, you don't have to worry too much about the impedance mismatch with the protocols.

172
00:23:38.700 --> 00:23:45.240
Wesley Terpstra: And because we have the three conformance levels. You only have to pay for what you need. Like here we had all these tiny peripherals. They could just use a simple protocol.

173
00:23:45.810 --> 00:23:52.050
Wesley Terpstra: And not pay law hard records are a lot of them, but here on the core is, you know, we want to pay hardware for caching. So we do

174
00:23:52.890 --> 00:23:58.740
Wesley Terpstra: I'm finally because there's this singular vocabulary doesn't mean there's only one way to transmitted

175
00:23:59.010 --> 00:24:08.820
Wesley Terpstra: And so that's where we have different ways of realizing, like we're talking about. You can do a parallel like a parallel boss will be addressing data in parallel, you can pack a ties it you can manage the transmission of

176
00:24:09.810 --> 00:24:14.130
Wesley Terpstra: Your message is pretty valid or credit based entrepreneurship. So that's it.

177
00:24:14.160 --> 00:24:15.540
Ted Marena: What question. One more question.

178
00:24:15.900 --> 00:24:21.090
Ted Marena: Goes tiling dash c support versus serialization for grant data messages.

179
00:24:23.700 --> 00:24:39.360
Wesley Terpstra: Burst cereal. I mean, yes, I mean, Grant data messages are almost always burst. So yes, are you asking me this question was directed about on the extent, in which case the answer is still yes all messages sent over on extend our sampling packet form so

180
00:24:39.870 --> 00:24:41.790
Wesley Terpstra: You're kind of comes first and then the data payload.

181
00:24:42.510 --> 00:24:51.090
Ted Marena: And then we had one other question. Let's see, do the agent nodes perform a handshake to decide which variant of the protocol to us.

182
00:24:51.600 --> 00:24:58.440
Wesley Terpstra: Know, so on ship you know this ahead of time, if you're going to do that would be a waste of hardware. So generally, you just

183
00:24:59.370 --> 00:25:03.810
Wesley Terpstra: You hook up agents that conform correctly. So let's sci fi. We have another technology we

184
00:25:04.320 --> 00:25:15.900
Wesley Terpstra: Have not talked about diplomacy, which does this sort of handshake at compile time so that you only put down hardware that you actually need. But there's no handshake. Once the chip is like built like

185
00:25:16.980 --> 00:25:21.780
Wesley Terpstra: That would be a total waste like a little slave, they had to like negotiate before they could turn on that would

186
00:25:21.780 --> 00:25:23.490
Dejan: Say, say, where's this is the end. Can you guys hear me.

187
00:25:24.060 --> 00:25:33.720
Dejan: I guess. Yeah. So this was probably a question relating to also to me, extend right so in case. So I'm extent it's a different situation. Yeah. So can you touch on that.

188
00:25:33.810 --> 00:25:35.010
Wesley Terpstra: Okay, sure, sure. So

189
00:25:36.240 --> 00:25:48.510
Wesley Terpstra: I was trying to answer the tiling question. I was timing. But yes, if it's on the extended you're plugging to gather chips that don't know, but each other, then yes, there is. We're working on and off the official name is still screech but at one point it was called that screech protocol.

190
00:25:50.010 --> 00:25:58.470
Wesley Terpstra: The idea there is that when you plan to chips together over on the extent they broadcast their capabilities with essentially device tree and then a coordinator

191
00:25:58.470 --> 00:26:01.110
Wesley Terpstra: Node assembles the the aggregate system.

192
00:26:01.320 --> 00:26:02.580
Wesley Terpstra: And then tells it running answer.

193
00:26:02.940 --> 00:26:12.090
Wesley Terpstra: So that's how discovery works when you are actually hot plugging hardware again in a chip, though. That's very different. There's no hot plugin it ships. So you wouldn't want to pay that off.

194
00:26:13.290 --> 00:26:13.530
Wesley Terpstra: Right.

195
00:26:13.560 --> 00:26:23.490
Ted Marena: And then the last question, which I think maybe an omni extend question as well. But the picture you showed the freedom unleashed had four cores. Can it be 16 or more

196
00:26:24.810 --> 00:26:26.820
Ted Marena: Talent better for cash coherence.

197
00:26:26.940 --> 00:26:36.450
Wesley Terpstra: Right, so, so the tiling protocol just tells you, like, what message you send back and forth. It doesn't have any. I mean, it doesn't say anything about micro architecture.

198
00:26:36.840 --> 00:26:42.030
Wesley Terpstra: If the question is how many cores, can you build that's more questions. My architecture. So

199
00:26:42.720 --> 00:26:48.210
Wesley Terpstra: You'll notice in this picture, we have a switch. What's not sending this picture to the bank. Here the bank del to has four banks.

200
00:26:48.720 --> 00:27:01.290
Wesley Terpstra: And if you have a switch, which is a crossbar from four to four, the cost of switch grows up quite radically. So, this particular micro architecture probably would not scale to, well, if you put 16 cores down right here. I got one hand, we would never build a system that way.

201
00:27:02.040 --> 00:27:10.590
Wesley Terpstra: We'd probably stamp out this whole gray box four times and then connect them with like an L three or another level of some kind of coherence management there.

202
00:27:10.800 --> 00:27:21.000
Wesley Terpstra: So the talent protocol says nothing about the limits. It's all about what your bike architecture is I don't think this micro architecture scales up to 16 cores in the gray box, but you can stamp out for the gray boxes. No problem.

203
00:27:21.420 --> 00:27:31.920
Wesley Terpstra: You can build a ring topology instead of a crossbar that connected things together. So, I mean, it's really scalability questions or micro architecture questions the tiling protocol stays the same, regardless

204
00:27:32.340 --> 00:27:48.210
Ted Marena: Okay, well, why don't we. Let's move on. So as Vladimir and Dan, why don't you guys take over the screen and Wesley, maybe you can look at the question and see if you can type an answer. So we do need to keep going here. Okay.

205
00:27:48.360 --> 00:27:49.200
Wesley Terpstra: I'll try and forget it.

206
00:27:49.590 --> 00:27:49.980
Okay.

207
00:27:52.110 --> 00:27:54.360
Ted Marena: So as bonomy. Are you going to share

208
00:27:58.830 --> 00:28:00.840
Jeffrey Osier-Mixon: Maybe some difficulty coming with me.

209
00:28:02.490 --> 00:28:03.270
Zvonimir Bandic: Am I sharing

210
00:28:03.540 --> 00:28:05.010
Ted Marena: Yes, we can see it now.

211
00:28:05.850 --> 00:28:06.810
Jeffrey Osier-Mixon: All right. Excellent.

212
00:28:07.230 --> 00:28:08.130
Alright, so

213
00:28:09.360 --> 00:28:13.440
Zvonimir Bandic: Good. A good evening, everybody. Thank you so much for coming to this Meetup.

214
00:28:14.490 --> 00:28:24.750
Zvonimir Bandic: Excited to talk about the Omni extend the fabric 20 minutes, probably not enough, but we'll kind of try to Dan and I will try to cover interesting things.

215
00:28:25.800 --> 00:28:30.360
Zvonimir Bandic: How this actually came about and then we'll share some exciting.

216
00:28:31.980 --> 00:28:35.970
Zvonimir Bandic: Experimental resolves the data and measurements with done on the on the extent

217
00:28:38.430 --> 00:28:45.510
Zvonimir Bandic: Um, so I'll start with data center CPU vision and recently announced the back plane reference

218
00:28:46.800 --> 00:28:50.490
Zvonimir Bandic: And then we'll jump into details of our

219
00:28:51.540 --> 00:28:54.810
Zvonimir Bandic: Architecture and implementation and some performance mode.

220
00:28:56.100 --> 00:28:56.760
Zvonimir Bandic: So,

221
00:29:00.510 --> 00:29:08.490
Zvonimir Bandic: So the vision of a data center of the future is a obviously multi threaded multi core CPU. The see

222
00:29:10.230 --> 00:29:15.510
Zvonimir Bandic: Importance of getting two out of order is five course that can support general purpose.

223
00:29:16.890 --> 00:29:20.730
Zvonimir Bandic: Operating System and software applications, but the, we believe that

224
00:29:21.990 --> 00:29:34.920
Zvonimir Bandic: Most important clinic activity for the CPU systems are those related to memory accelerators and this is basically how extend came out to be. We wanted to develop open source.

225
00:29:35.520 --> 00:29:45.510
Zvonimir Bandic: Interface open source implementation of the interfaces that can be used bring memory in large amounts of persistent memory to the system.

226
00:29:45.990 --> 00:29:57.150
Zvonimir Bandic: Was kind of the primary motivation along the AV obvious Lee discovered how interesting to actually bring cash coherent cores into the system.

227
00:29:57.540 --> 00:30:05.250
Zvonimir Bandic: That can be used for specialized workloads important machine learning and inference and allows those subsystems to

228
00:30:05.940 --> 00:30:17.790
Zvonimir Bandic: Share memory coherently with the main cores that are running us. So that's really, that's really how this kind of came to life. Many years ago, and

229
00:30:18.720 --> 00:30:27.810
Zvonimir Bandic: And the vision is really to allow then a large number of risk five compute nodes to connect to university shared memory exactly like when this picture.

230
00:30:29.010 --> 00:30:39.630
Zvonimir Bandic: The, the very much like the vision of using programmable switches the base version of the protocol will work with Ethernet switches, as well as some top of L to

231
00:30:40.770 --> 00:30:45.690
Zvonimir Bandic: Programmable switches allowed to do interesting things potentially bring more performance in the system.

232
00:30:46.560 --> 00:30:57.210
Zvonimir Bandic: And this kind of architecture enables memory appliances. So for those of you who have heard the about segregation and desegregation with storage where you can virtually move

233
00:30:57.810 --> 00:31:03.210
Zvonimir Bandic: Storage and associated with a specific node or pool, the storage for multiple nodes to present

234
00:31:03.960 --> 00:31:14.760
Zvonimir Bandic: To present shared storage. This is exactly kind of things that we envision possibly on the extent and we are interested in building memory appliances for this kind of system.

235
00:31:15.270 --> 00:31:34.590
Zvonimir Bandic: And obviously, they are also very interested in the earliest five coherent nodes that can offload the AI workloads and and one one amazing interesting thing here we are in the early days where we're hooking up Western Digital and see five hardware, but

236
00:31:36.390 --> 00:31:48.630
Zvonimir Bandic: Everything about the standard is an open source and a lot of implementation pieces are also available for example tiling implementation is is is available can be can be obtained at the chips allies. Get that

237
00:31:49.140 --> 00:32:05.700
Zvonimir Bandic: So this, this impressive will mean that multiple different vendors will be able to use on the extent and and the people and marketing managers will be able to envision cash coherent systems with equipment for different vendors that can share memory.

238
00:32:07.410 --> 00:32:21.120
Zvonimir Bandic: And this is a amazing board. It took us a while to to design this, our former CTO Martin think unveiled this this board. The at the

239
00:32:22.230 --> 00:32:34.500
Zvonimir Bandic: In the presentation and the risk five summit last year and we are bringing this board up in the lab, it's, it's not fully brought up yet Corona got this kind of slow, slow down.

240
00:32:35.190 --> 00:32:42.870
Zvonimir Bandic: As the total setup into Silla scopes and they use the prediction for the board kind of has almost hundred kilos. So it wasn't exactly portable

241
00:32:44.220 --> 00:32:50.580
Zvonimir Bandic: But, but we are working on it. And then we got at least you have the

242
00:32:51.930 --> 00:33:08.640
Zvonimir Bandic: Course FTP lanes and small form factor lanes up and running will will share the designs with the whole world. I think currently the designs are available to chips alliance members in the in the private in the private folders and and we plan to share the board as well.

243
00:33:10.260 --> 00:33:11.580
Zvonimir Bandic: So, so now

244
00:33:13.320 --> 00:33:27.360
Zvonimir Bandic: I want to jump into I'm going to let Dan explain a little bit of a background on Omni extend and show our demo and performance results, then I hope you are unmuted, and

245
00:33:27.900 --> 00:33:30.060
Dejan: I can try talking. He's my audio good

246
00:33:30.330 --> 00:33:31.530
Zvonimir Bandic: Yes. Perfect, yeah.

247
00:33:31.560 --> 00:33:37.680
Dejan: Okay, I'm not reading video because my audio is choppy from time to time. So I don't want to make it worse. Alright. So, hello everyone.

248
00:33:38.430 --> 00:33:53.640
Dejan: This is a slide that sort of shows a very high level vision of what on the extended entirely been and I imagined it on this for mostly you know this stuff, but just to make sure, if there are any folks who are not familiar with what a coherence protocol means this is sort of the

249
00:33:54.990 --> 00:34:01.620
Dejan: Coming from the direction of memory centric systems are building systems with large amounts of desegregating memory.

250
00:34:02.430 --> 00:34:07.980
Dejan: When you ask a random person on the street, how to build a system with, you know, a petabyte of main memory.

251
00:34:08.430 --> 00:34:15.930
Dejan: They'll typically draw you something like the first picture on top. I'm not sure if I can point them zoom. I don't think I can point. So I'll just described, where the pictures are

252
00:34:16.860 --> 00:34:25.950
Dejan: So the top picture with the with the DRAM and a DMA engine is typically how folks build systems with large amounts of memory that can be accessed from

253
00:34:26.370 --> 00:34:33.840
Dejan: More than say 16 cores, or whatever the largest core come two days. And this is a system where you have an Rd ma

254
00:34:34.380 --> 00:34:43.560
Dejan: Setup. So you have a nick with a DMA engine and then that Nick has some software that runs its and sets up some tables that translate your local vision of virtual memory to a remote

255
00:34:44.070 --> 00:34:55.950
Dejan: You know vision of a cloud address space and then you call these Salter functions to fetch chunks of memory that are typically you know Kilobyte, Megabyte sizes. That's this little

256
00:34:56.610 --> 00:35:07.710
Dejan: Lightning bolt that you see. So this is issues and mainly shift for us as a storage maker is that for certain kinds of new by the addressable storage has very little latency.

257
00:35:08.490 --> 00:35:19.260
Dejan: The, the whole process. So fetching and context switching. And so, and it's just takes the longer it takes on the order of several microseconds. And we have technologies that are in a microsecond range. So we need something faster.

258
00:35:20.220 --> 00:35:31.380
Dejan: Now a lot of you may have heard about these lists, such as Josie, and that's the picture on the right where you do away with the DMA engine and you have some sort of tables that translate your local address based on the cloud.

259
00:35:32.040 --> 00:35:37.200
Dejan: Addressing but you still have some software that manages the straight and in case of Jersey. This was called the librarian.

260
00:35:37.830 --> 00:35:52.800
Dejan: And there are other solutions out there. So this is again not what we mean by Omni extended by link. So the picture on the bottom is correct, right. The picture on the bottom means that you have unified coherence space, right. So all your CPUs and all your memory controllers.

261
00:35:54.000 --> 00:36:07.530
Dejan: Sit on the other side of some imaginary fabric and your page tables. So that's what the diagram shows there and your PhD was may not be local to your. Soc. Right. That's sort of the key thing here. So if you move on to the next slide.

262
00:36:12.690 --> 00:36:14.850
Dejan: I'm not. It's coming slowly.

263
00:36:18.480 --> 00:36:23.820
Dejan: It's coming very slowly. I think we can skip this one. How much, how much time do we have

264
00:36:23.940 --> 00:36:36.450
Zvonimir Bandic: Here we have about 1010 minutes so i think i think the audience will probably get most excited if you maybe start from the, from the data plan implementation slide. Unless you want

265
00:36:37.740 --> 00:36:38.850
Zvonimir Bandic: You know, one of the tour earlier.

266
00:36:39.090 --> 00:36:42.840
Dejan: Sure, yeah. So let's talk about this. Right. So can you

267
00:36:43.290 --> 00:36:44.190
Dejan: Maximize

268
00:36:44.640 --> 00:36:46.170
Zvonimir Bandic: The one day. One day data plane.

269
00:36:47.490 --> 00:36:48.270
Dejan: Yeah, that one that

270
00:36:50.550 --> 00:36:51.270
Dejan: Basically to

271
00:36:51.570 --> 00:36:52.650
Zvonimir Bandic: Get to the demo and

272
00:36:52.650 --> 00:36:59.760
Dejan: Yeah, okay. So before we jump into the details, right. So this is the actual bit format or the packet packing that we have

273
00:37:00.150 --> 00:37:05.100
Dejan: And I believe this is all the obsolete, because I think we've changed back to alto framing.

274
00:37:05.910 --> 00:37:11.100
Dejan: But the point here is that we've looked at are quite a number. So there was one question right, how do we

275
00:37:11.550 --> 00:37:17.130
Dejan: How do we choose Thailand. Right. So we didn't really choose styling. The problem was, there was nothing else out there.

276
00:37:17.730 --> 00:37:24.990
Dejan: And similar problem that we had with scaling tiling off chip is that there was really nothing out there that was open and not proprietary

277
00:37:25.470 --> 00:37:33.060
Dejan: And also that will be widely available without everybody having to make Association. And so essentially we converge on Ethernet.

278
00:37:33.750 --> 00:37:40.170
Dejan: And the reason being that that's pretty much the only fabric out there. It's completely open unencumbered and it's really widely available.

279
00:37:41.040 --> 00:37:46.800
Dejan: It does have some quirks. Right. There's no in order delivery and it's not a reliable fabric, but we can work around these things.

280
00:37:47.340 --> 00:37:59.550
Dejan: And then essentially what we do is we just package the styling messages into alto frames and we call that on the extent. So when somebody asks you what is on the extent it's essentially telling or you can

281
00:38:00.600 --> 00:38:17.280
Dejan: Now to present. One of the questions, it's certainly going to get asked is this Ethan at specific by no means right so this is styling cover anything is possible that we chose it and it just because you can use it without having to jump through too many hoops. So this slide shows

282
00:38:18.600 --> 00:38:23.610
Dejan: I think this is the obsolete 0.1 version of the header. So the new header is actually available on GitHub.

283
00:38:24.450 --> 00:38:33.480
Dejan: You can download the specification document and look up the fields. And if you happen to be using one of the smart switches the P for switch from barefoot networks. Now, Intel

284
00:38:34.320 --> 00:38:46.260
Dejan: You can actually parse the styling packets in the switch and you can do interesting things to them. So there was one of the great appeals of this method is that we could actually process coherence protocol in the switch on the fly.

285
00:38:48.120 --> 00:38:51.900
Dejan: Okay. Can we move on to perhaps the performance slide.

286
00:38:56.700 --> 00:39:05.490
Dejan: Yeah, this was an example of before their flesh by. So this is an example set up the women a lab we're probably don't want to describe. Yes. Let's talk about performance. So this slide here.

287
00:39:06.660 --> 00:39:14.430
Dejan: shows our measurements. By the way, you can reproduce this yourself. All you need to do is you need to buy two Xilinx eval boards and this is Alex.

288
00:39:16.260 --> 00:39:17.940
Dejan: Or that runs the see five

289
00:39:19.410 --> 00:39:26.280
Dejan: I believe this was before, and now it's you 74 cores. So the binaries are available from GitHub for download.

290
00:39:26.760 --> 00:39:33.630
Dejan: And what you can do is you can set this up in three different ways right so you can set it up as a single board. So that's the red line on this blog.

291
00:39:34.380 --> 00:39:39.480
Dejan: Or you can hook up to boards, back to back with an Ethernet cable that those are two green lines.

292
00:39:39.930 --> 00:39:49.080
Dejan: And then you can go through a switch right so you can basically connect two boards to a Ethernet switch just any of the shots, which works or you can use barefoot Tofino like in this block.

293
00:39:49.800 --> 00:40:00.210
Dejan: And these measurements are a little bit out of date because we're still running a 50 megahertz on the FPGA course. But essentially what they show is on the x axis is a random

294
00:40:00.780 --> 00:40:10.320
Dejan: Memory access test and the x axis shows the size of the window within which you were touching 64 by cache lines and then the Y axes are

295
00:40:10.920 --> 00:40:18.270
Dejan: Latency measured in milliseconds imaginary clock cycles. Again, the milliseconds look pretty bad, but keep in mind that we're running an RPG a 50 megahertz.

296
00:40:19.230 --> 00:40:32.100
Dejan: So what you see is on the left, right, the L one means that every test that you access within a 32 K window. Pretty much hits the cash right so you only see to clock cycle one latency.

297
00:40:32.820 --> 00:40:39.540
Dejan: And then as you transition to a larger window up to one megabyte you slowly start to hit Elko cash more and more. So you see

298
00:40:40.470 --> 00:40:48.330
Dejan: About 1,000,000,027 clock cycle latency for or sorry 22 second latency for I'll do and then beyond one megabyte you start hitting DRAM.

299
00:40:48.900 --> 00:40:56.670
Dejan: And so now things get interesting. Right. So the red line shows the latency to access local DRAM on the SOC on which the CPU cores.

300
00:40:57.570 --> 00:41:08.370
Dejan: The green lines show access to both local and remote DRAM, meaning that remote, meaning that you're on the other board right through an Ethernet cable beyond makes them vertical

301
00:41:08.880 --> 00:41:19.170
Dejan: And you see that these lines are quite close. The reason being that you're not getting the red line here for the local beer and is that every coherence request has to go across and check that the other cash doesn't have it.

302
00:41:19.950 --> 00:41:28.620
Dejan: And this was an early version of this protocol where we didn't have directory implementation. So essentially all requests have to check all the caches and obviously this is not scaling.

303
00:41:29.220 --> 00:41:35.760
Dejan: So, pretty soon we'll have a directory enabled implementation that's going to eliminate this checking all caches.

304
00:41:36.720 --> 00:41:46.500
Dejan: And then as you move to the blue lines. These are the same Layton sees as you when you go to the switch. You see that the switch ads about 1.2 microsecond store round trip checking

305
00:41:47.070 --> 00:41:58.710
Dejan: Whether for local or remote. There are empty room axis. And the important point than this blog is that the red line and the green lines right so the accesses to local DRM and the access is over a cable.

306
00:41:59.490 --> 00:42:13.890
Dejan: To a more directly attached will scale as low frequency, to a large extent, right, whereas the green to blue difference will not. So this is the actual hard latency going to switch over the wire and you know see realizing this realizing, at least four times.

307
00:42:16.890 --> 00:42:22.140
Dejan: So I think that's all we should see about this plot and we're probably running out of time or close

308
00:42:22.560 --> 00:42:24.150
Zvonimir Bandic: I think we can show the

309
00:42:24.330 --> 00:42:25.740
Ted Marena: Yeah. One more minute still

310
00:42:25.890 --> 00:42:27.150
Dejan: Yeah, this is

311
00:42:28.590 --> 00:42:31.320
Dejan: Yeah, you can. You can take over the site demo. If you want to

312
00:42:32.010 --> 00:42:37.170
Zvonimir Bandic: Yeah, so, so we show this and risk. Risk five

313
00:42:38.850 --> 00:42:49.110
Zvonimir Bandic: Summit in the booth. Last year we had the two boards those islands boards with the big dreams that vastly prepared.

314
00:42:49.800 --> 00:43:05.460
Zvonimir Bandic: And and we we had four risk five hearts running on each, each of the boards heart is a legal risk five lingo for a thread or so basically we have four cores.

315
00:43:06.450 --> 00:43:21.660
Zvonimir Bandic: And they're there 123 and four. They're all 64 bit rocket core you 54 and then we have a second node with course 910 11 and 12 running on another node.

316
00:43:22.560 --> 00:43:34.980
Zvonimir Bandic: And and when we, when we actually go into the CPU info VC VC eight course you know for course local to one node and for course.

317
00:43:35.820 --> 00:43:50.340
Zvonimir Bandic: From the other node connected to the switch. So this is kind of very cool. It was a sort of a first demonstration of the open open source symmetric multi processing protocol.

318
00:43:51.150 --> 00:44:02.220
Zvonimir Bandic: Or or cash coherence and this is something that Dan and I've been seeking, you know, for, for many years, and it's becoming a reality with that only extent.

319
00:44:05.970 --> 00:44:07.590
Zvonimir Bandic: And I think that's the that's

320
00:44:08.970 --> 00:44:09.870
Zvonimir Bandic: The last slide.

321
00:44:12.990 --> 00:44:13.620
Zvonimir Bandic: I tried to

322
00:44:13.800 --> 00:44:14.430
Ted Marena: Before

323
00:44:14.670 --> 00:44:17.040
Zvonimir Bandic: I was going into the chat window and trying to answer.

324
00:44:17.130 --> 00:44:18.330
Ted Marena: I see. Yeah, I was kind of

325
00:44:19.140 --> 00:44:22.380
Dejan: I'm typing up an answer to one of the questions though, how much these switches cost.

326
00:44:22.950 --> 00:44:32.040
Dejan: So I can just say the, the numbers around 9004 30 points and 7464 ports, but these switches programmable programmable with this price is about the same.

327
00:44:32.550 --> 00:44:41.700
Dejan: So they're not like, you know, I don't know what the expensive means to the person who asked the question, but these are not outrageous prices right this is all fairly reasonable for data center equipment.

328
00:44:42.510 --> 00:44:56.970
Ted Marena: I think the other though point is before we hand it off to Kurt, you know, we should also just let people know sort of the status. Right. Like, you may want to just sort of reiterate summarize for folks.

329
00:45:00.600 --> 00:45:10.260
Zvonimir Bandic: Yeah, I just want to add that cash coherence switches built as a custom device cost tremendous amount of millions of dollars.

330
00:45:10.770 --> 00:45:26.400
Zvonimir Bandic: Ethan It switches are not cheap. It's not something that typically are put into your house, but our normal sort of normal idea equipment that's fairly normal for any University Academic group company, etc. So they're, they're not prohibitively expensive.

331
00:45:29.280 --> 00:45:41.610
Ted Marena: Okay. Um, I think, I think the questions have been answered. But obviously if folks have some other you know feedback or questions. I also typed a couple of

332
00:45:42.210 --> 00:45:56.700
Ted Marena: web links in the in the Q AMP. A. So if people want more information on on the extend, there's a link to the GitHub for like a lot of the details the spec and so on, that they had had mentioned and then

333
00:45:57.270 --> 00:46:03.780
Ted Marena: Also, just for videos and sort of more instructional things. There's a Western Digital link.

334
00:46:05.220 --> 00:46:07.410
Ted Marena: So let's, let's turn it over to Kurt

335
00:46:09.270 --> 00:46:09.930
Curt Beckmann, Intel: Already, how many

336
00:46:10.860 --> 00:46:14.940
Zvonimir Bandic: Do I have to do something special to corporate CAN JUST GRAB UP stop share

337
00:46:15.060 --> 00:46:19.560
Curt Beckmann, Intel: And let's see what we got here and it's here try this.

338
00:46:23.520 --> 00:46:25.770
Curt Beckmann, Intel: Can you see me see my screen. Yes.

339
00:46:25.830 --> 00:46:29.640
Curt Beckmann, Intel: Yes. All right. Now when I go put it in presentation mode and everything.

340
00:46:30.090 --> 00:46:30.930
Curt Beckmann, Intel: Explodes you

341
00:46:33.900 --> 00:46:36.450
Curt Beckmann, Intel: Know, you can still see it right now. Why should my

342
00:46:37.230 --> 00:46:38.580
Ted Marena: Yes, it's still okay

343
00:46:38.700 --> 00:46:39.480
Curt Beckmann, Intel: Visual great

344
00:46:39.660 --> 00:46:40.050
Alright.

345
00:46:42.120 --> 00:46:49.530
Curt Beckmann, Intel: So we'll dive right in. Actually, I don't have much to say. Because it seems like the presentation was given earlier. I'm only joking, but

346
00:46:50.610 --> 00:46:56.190
Curt Beckmann, Intel: I was very pleased to see that that discussion of how I Tofino

347
00:46:57.660 --> 00:47:06.870
Curt Beckmann, Intel: Was able to be adapted to all these things. I kind of knew that before but it was even adapted to multiple versions. So this picture here is meant to show how like classic

348
00:47:07.500 --> 00:47:22.470
Curt Beckmann, Intel: Network elements work. They basically the network elements are essentially fixed function. That's the classic fixed function a sick and you don't get to decide how you want things to work the biggest bunch of basic kind of tells you how that's going to work.

349
00:47:23.520 --> 00:47:24.690
Curt Beckmann, Intel: And it's

350
00:47:28.830 --> 00:47:29.670
Curt Beckmann, Intel: You can't

351
00:47:30.870 --> 00:47:35.910
Curt Beckmann, Intel: Push down the network requirements into the basic moving on to the

352
00:47:36.990 --> 00:47:45.450
Curt Beckmann, Intel: The goal in Tofino case in the programmable network case. And I should say people was invented about 10 minutes before you know that

353
00:47:46.080 --> 00:47:54.720
Curt Beckmann, Intel: There's this pent up demand for something better. And as soon as people arrived. It seemed like there was a surge and barefoot was founded back in

354
00:47:55.260 --> 00:48:06.090
Curt Beckmann, Intel: 2013 or 2012 where everyone's i'm i'm at barefoot, but I wasn't there at that early was in the neighborhood, when it was founded doing. I remember all the energy so that

355
00:48:07.080 --> 00:48:16.320
Curt Beckmann, Intel: The point here is that we got to figure out what we want the network to do and that will, in turn, allow us to specify in our programmable programmable chip.

356
00:48:17.370 --> 00:48:21.780
Curt Beckmann, Intel: The behavior that we want on that chip. Now that Tofino chip as I build this out.

357
00:48:23.310 --> 00:48:40.650
Curt Beckmann, Intel: It's barely eat are not aware as was actually described a little bit ago that you turn that is the sort of least constrained widely available protocol we get all kinds of great studies and other devices that are they have lots of people out there competing to provide those to you.

358
00:48:41.820 --> 00:48:50.040
Curt Beckmann, Intel: In the toppino chip itself, there are Matt cores and I'll show you that in a minute. But essentially, we define what we want and we go. Push down into the Tofino

359
00:48:50.520 --> 00:49:09.420
Curt Beckmann, Intel: What the behavior is that we want and other than you know CRS ease at the end of packets and it's hard to get a lot too far below 64 bytes. It's not very even Ethernet aware, despite having either that Max, we cannot use too much of the Ethernet specifics

360
00:49:10.860 --> 00:49:19.230
Curt Beckmann, Intel: But they're they're sort of a limit, at some level, because now you get down into there's even physical layer stuff that's effectively either not specified now.

361
00:49:21.480 --> 00:49:22.290
Curt Beckmann, Intel: Alright, so

362
00:49:23.310 --> 00:49:30.330
Curt Beckmann, Intel: So how do we do this. Here's the complicated version. Right. You take before you stick it through a compiler and it generates a

363
00:49:31.710 --> 00:49:32.100
Curt Beckmann, Intel: A

364
00:49:34.800 --> 00:49:36.360
Curt Beckmann, Intel: Binary that is loaded into

365
00:49:36.420 --> 00:49:42.690
Ted Marena: Her real quick. So there was just a question. What does P for stand for. I know it's a programming language, maybe you can just quickly answer.

366
00:49:42.750 --> 00:49:59.850
Curt Beckmann, Intel: Oh yeah, sure. That seems like a good place to start. It stands for protocol independent packet processing programs are programmable protocol independent packet processing. I forget it's it's comes from the four P's and I tend to rearrange them sometimes but

367
00:50:01.410 --> 00:50:06.300
Curt Beckmann, Intel: It is a protocol that was conceived quite some time ago and it it

368
00:50:08.370 --> 00:50:16.470
Curt Beckmann, Intel: The idea is that we have a this is a fixed function version. So sort of classic Ethernet switch that I designed, you know, years ago.

369
00:50:17.250 --> 00:50:24.540
Curt Beckmann, Intel: Long before barefoot and you have a fixed par, sir, it does fixed look ups in a fixed sequence. So first we look up

370
00:50:25.290 --> 00:50:32.520
Curt Beckmann, Intel: Sort of Mac addresses and we decide if it's valid, you know, either net. And then we look up IP addresses and maybe we look up TCP ports.

371
00:50:32.850 --> 00:50:39.720
Curt Beckmann, Intel: We look up ankles and we do various things. Maybe we maybe we didn't do tunneling, but all that stuff was just basically hardwired

372
00:50:40.080 --> 00:50:46.680
Curt Beckmann, Intel: And you wouldn't have been able, well, I don't know how you could take that and do we extend on a on a chip like that.

373
00:50:47.490 --> 00:50:54.750
Curt Beckmann, Intel: On the programmable switch, though, we have the ability, it's a flexible Parker Parker, it's you actually define how you want to parse things in P for

374
00:50:55.530 --> 00:51:03.120
Curt Beckmann, Intel: You compile that and configure the the switch by good. Well, you don't get to see the details and you don't want to the compiler.

375
00:51:03.810 --> 00:51:13.320
Curt Beckmann, Intel: Does mappings on these shared memories and flexible look ups and the actions are also pretty flexible I'll show some other creative things that people have done with

376
00:51:14.040 --> 00:51:23.370
Curt Beckmann, Intel: Tofino and a little bit. And there's a variable number of stages. I mean, the chip has limits 12 for the first chip and 20 for the second chip that's

377
00:51:24.420 --> 00:51:27.150
Curt Beckmann, Intel: Just coming online and be in production later this year.

378
00:51:30.840 --> 00:51:42.720
Curt Beckmann, Intel: But the program may not use all those stages which can reduce your latency and there's various ways it can reassign memories and so on. So all that's very programmable how fields are assigned and so on.

379
00:51:43.800 --> 00:52:03.360
Curt Beckmann, Intel: We go through this past on ingress and then it essentially buffers up the result queues it up for transmission and there's another egress side processing opportunity. It's not required. So in some cases you can bypass the grasp or lower latency. If there is no

380
00:52:04.530 --> 00:52:07.140
Curt Beckmann, Intel: No manipulation of the packet that's required.

381
00:52:09.960 --> 00:52:19.980
Curt Beckmann, Intel: So, this leaves you with a customer definable switch and we say customer, from our perspective of firms like Cisco and Arista our customers.

382
00:52:20.910 --> 00:52:36.870
Curt Beckmann, Intel: Whether they decide to share that on to the end customer is is up to them. Sometimes, both are possible that they do their own definition. And then, you know, if you paid a licensing fee for the tools, then you can the end user, the operator, the network operator can also

383
00:52:38.340 --> 00:52:50.130
Curt Beckmann, Intel: Make changes. This is a huge. This is hugely valuable independent of Omni extent. But obviously, I'm the extent takes it kind of to a new level because it's it's really doing creative things down here.

384
00:52:51.660 --> 00:52:56.010
Curt Beckmann, Intel: Where you can define in relatively few numbers of tables.

385
00:52:57.630 --> 00:53:13.050
Curt Beckmann, Intel: How to do simple forwarding not based on traditional addressing or routing schemes, but you also have the ability to extend that. And as mentioned, you could use even some of the memory internally to The Tofino as shared memory for

386
00:53:14.310 --> 00:53:18.870
Curt Beckmann, Intel: You know, other devices that are connected to, to your switch or to your larger network.

387
00:53:22.380 --> 00:53:29.550
Curt Beckmann, Intel: I mentioned, you'll see what's another good thing to say about P for here, there, there are a number of extensions. There are ways that we can

388
00:53:30.690 --> 00:53:34.920
Curt Beckmann, Intel: create what are called external to do things that are kind of beyond what before does

389
00:53:35.670 --> 00:53:51.030
Curt Beckmann, Intel: Things like Bloom filters and other kind of creative things. I don't know how applicable. Those might be Dominic's to end. But I want to give you the idea that you could do something that could potentially combined on next end with other features in in your environment.

390
00:53:52.080 --> 00:53:55.590
Curt Beckmann, Intel: Here's the simple picture again. So it goes through the programmable Parker.

391
00:53:57.210 --> 00:54:04.770
Curt Beckmann, Intel: It parts or break things up into these sort of header fields that are there looked up in the tables and then

392
00:54:05.520 --> 00:54:12.540
Curt Beckmann, Intel: Manipulations are done through this max match action table and it's passed on to the next stage. And we've shown here that the

393
00:54:13.050 --> 00:54:22.680
Curt Beckmann, Intel: The header fields have changed between stage one and stage two, because of that manipulation and so on and so on. It goes through. There's only four stages, shown here, but

394
00:54:23.790 --> 00:54:33.360
Curt Beckmann, Intel: You know there's 12 are in the first gen and 20 in the second gen and this is just in eat grass or I mean sorry in ingress or egress as well.

395
00:54:34.410 --> 00:54:41.820
Curt Beckmann, Intel: And then they're put into a queue and and sent or or put into a 30 cent if it's on the growth side.

396
00:54:45.390 --> 00:54:57.030
Curt Beckmann, Intel: So the programmable parts are as we talked about already. It car carbs up those headers. We typically you know most of our users really are doing Ethernet and typically IP on top of Ethernet.

397
00:54:58.200 --> 00:55:05.520
Curt Beckmann, Intel: But sometimes, once you get past IP, then they start doing more creative things. I mean, you could do npls which I suppose is actually

398
00:55:06.150 --> 00:55:15.180
Curt Beckmann, Intel: In the layer between the internet and and IP, for example. So all these match action fields are quite generic. There's nothing particularly special about them.

399
00:55:16.080 --> 00:55:26.850
Curt Beckmann, Intel: And you can combine make very wide keys and the memories in here are allowed for exact matching longest brief so matching matching and

400
00:55:28.170 --> 00:55:39.420
Curt Beckmann, Intel: Hashing as well. So for large keys that's obviously useful because, you know, can't do a direct look up with a you know 64 bit key or 128 bit key.

401
00:55:41.610 --> 00:55:49.200
Curt Beckmann, Intel: Let's see, we talked about exact match T cam ternary match for those wide keys, where we don't want to

402
00:55:50.940 --> 00:55:54.660
Curt Beckmann, Intel: Don't care. Some bits 12 to 20 hardware stages.

403
00:56:00.840 --> 00:56:13.620
Curt Beckmann, Intel: Spilled out. No, sorry. So this is a we showed in Tupelo hash there we can have drops that us. There's another other like internal information that we can pass around. So for example, the queuing

404
00:56:14.280 --> 00:56:26.520
Curt Beckmann, Intel: State in the traffic manager is available and can be captured and shared we can respond to certain events like packet arrivals, but also time although we heard earlier that

405
00:56:28.020 --> 00:56:30.000
Curt Beckmann, Intel: timeouts are not appropriate for

406
00:56:32.520 --> 00:56:39.060
Curt Beckmann, Intel: Well at least tiling maybe for Omni extended I'm still learning the, the differences between you know what works and what doesn't.

407
00:56:39.690 --> 00:56:57.540
Curt Beckmann, Intel: But we can generate telemetry essentially in a variety of cases, we can do mirroring and so on, which is useful for obviously monitoring what's going on in your network and potentially that would be relevant for an omni extend an environment where we would send messages over

408
00:56:58.830 --> 00:57:08.490
Curt Beckmann, Intel: Maybe a traditional Ethernet link to to some other device based based on what's going on inside the the Omni extend sort of universe.

409
00:57:09.600 --> 00:57:12.270
Curt Beckmann, Intel: Those telemetry messages are are

410
00:57:13.350 --> 00:57:19.830
Curt Beckmann, Intel: Fully in the data path. So the control plane does not need to be involved, which is good because you can tend to overwhelm your control plane.

411
00:57:20.160 --> 00:57:28.530
Curt Beckmann, Intel: It's not good for the control plane and it's not good for the telemetry and so they can get sent off to something that's got more horsepower and can gather all that up.

412
00:57:29.580 --> 00:57:31.080
Curt Beckmann, Intel: And and then

413
00:57:33.570 --> 00:57:36.480
Curt Beckmann, Intel: Basically, draw the insights from the data as it arrives.

414
00:57:38.040 --> 00:57:52.200
Curt Beckmann, Intel: Here are some examples of creative applications that people have done with Tofino so switch ML. That's a machine learning done in the switch. Well, leveraging the capabilities of the switch. If you're into machine learning.

415
00:57:53.220 --> 00:57:58.680
Curt Beckmann, Intel: Training especially often the training data is so huge, you may be doing parallel

416
00:58:00.030 --> 00:58:17.010
Curt Beckmann, Intel: Parameter regression and multiple servers and they need to make their contributions all to each other in the classic model that's an n squared, kind of a problem, but by moving the parameter aggregation of these updates into the network effectively in the Tofino

417
00:58:18.180 --> 00:58:28.020
Curt Beckmann, Intel: Then that n squared problem turns to an order n problem. And they're actually like several percentage improvements. There's a secondary benefit as well by

418
00:58:28.590 --> 00:58:43.110
Curt Beckmann, Intel: When something's n squared. And it's interesting. It seemed like there was a parallel to the next 10 discussion earlier that things scale a lot better when things are order and then order n squared. So typically, the cluster size for switch ML are for

419
00:58:44.160 --> 00:58:44.670
Curt Beckmann, Intel: Large

420
00:58:45.750 --> 00:58:55.830
Curt Beckmann, Intel: Training clusters often maxes out around 16 so you tend to make the each of the servers as large as you can sometimes didn't get into 32 or 64 but when you have an order n.

421
00:58:58.830 --> 00:59:08.010
Curt Beckmann, Intel: Behavior, then you can scale that out to a much larger number without as much pain and as, as we've seen these. The, the databases, the data structures that are

422
00:59:08.550 --> 00:59:16.200
Curt Beckmann, Intel: Data lakes that people are crunching on are getting huge and they want to use as big a cluster as they can to get things to iterate quickly.

423
00:59:17.010 --> 00:59:29.370
Curt Beckmann, Intel: So that's a quite creative use case advanced congestion control. It does seem like our DMA was a part of this discussion as well. One of our end customers actually

424
00:59:30.630 --> 00:59:33.480
Curt Beckmann, Intel: Did some creative work, I'd mentioned the telemetry, which is

425
00:59:34.650 --> 00:59:46.350
Curt Beckmann, Intel: I empty. Well, they did a variation on telemetry it's telemetry usually goes to some, you know, external observer in this case they were using some of that telemetry capabilities to actually

426
00:59:47.520 --> 01:00:02.340
Curt Beckmann, Intel: Enhance the protocol improve the performance of rocky by either sending a head to the receiver information that would otherwise come later due to prioritization or sends back to the sender.

427
01:00:03.510 --> 01:00:11.940
Curt Beckmann, Intel: Sort of an early response. There's different variations of this that are being experimented with particularly around the rocky protocol very, you know, very

428
01:00:15.450 --> 01:00:16.410
Curt Beckmann, Intel: much improved.

429
01:00:17.850 --> 01:00:25.380
Curt Beckmann, Intel: Latency behavior, particularly on the endpoints, and as a result you want much you're much more sensitive to the latency in the network which is

430
01:00:26.460 --> 01:00:29.760
Curt Beckmann, Intel: Certainly very competitive on the Tofino but the ability to

431
01:00:31.380 --> 01:00:38.610
Curt Beckmann, Intel: To generate this advanced congestion control messages essentially programmatically without, you know, without responding silicon

432
01:00:39.420 --> 01:00:47.880
Curt Beckmann, Intel: That's pretty much a Tofino capability. If you want to look that up. That's an HP CC is the high precision congestion control paper.

433
01:00:48.450 --> 01:00:55.050
Curt Beckmann, Intel: You can look up switch and Mel as well. And then one of the other interesting ones is the telemetry have already mentioned before. That's another

434
01:00:55.890 --> 01:01:08.550
Curt Beckmann, Intel: Protocol, it's been in developed before.org developed it early. There's a IOM group now and I ETF that's working on that. The point I'm trying to make there is that telemetry is

435
01:01:09.600 --> 01:01:24.600
Curt Beckmann, Intel: There's lots of interest in it and it's slightly evolving and having a programmable devices is very useful. In this case, we haven't talked about, to my knowledge in that telemetry group, they're not talking yet about Omni extend and maybe they should and having a very programmable device.

436
01:01:26.190 --> 01:01:30.540
Curt Beckmann, Intel: lends itself to that so that we could do telemetry, you know,

437
01:01:31.680 --> 01:01:40.830
Curt Beckmann, Intel: Take Omni extend and then send telemetry message out some different port over Ethernet IP network. As I mentioned before,

438
01:01:43.200 --> 01:01:44.100
Curt Beckmann, Intel: So that was a

439
01:01:45.270 --> 01:01:50.460
Curt Beckmann, Intel: People in Tofino in a nutshell. Are there any other questions or

440
01:01:51.480 --> 01:01:52.530
Curt Beckmann, Intel: Any first questions.

441
01:01:55.560 --> 01:01:57.030
Curt Beckmann, Intel: Things I should have covered and didn't

442
01:01:58.950 --> 01:02:00.630
Jeffrey Osier-Mixon: There during have a chat.

443
01:02:01.080 --> 01:02:03.210
Ted Marena: Yeah, let's see, there's a question in the chat.

444
01:02:05.220 --> 01:02:07.290
Ted Marena: So I'm not sure. Can you see that

445
01:02:08.700 --> 01:02:09.300
Curt Beckmann, Intel: In a moment.

446
01:02:11.610 --> 01:02:16.680
Ted Marena: Let me, let me ask you, let me, I'll just ask it. Has anybody gotten P for to scale up to

447
01:02:16.680 --> 01:02:23.490
Ted Marena: 130 800 G for classification on FPGA days without eating up the entire FPGA real estate.

448
01:02:24.090 --> 01:02:27.540
Curt Beckmann, Intel: Uh huh. Well we on the FPGA side, I could

449
01:02:27.690 --> 01:02:31.590
Curt Beckmann, Intel: Probably get back to you on that. By talking to the FPGA group at Intel

450
01:02:33.240 --> 01:02:35.400
Curt Beckmann, Intel: Hundred G. I think so. I know.

451
01:02:38.400 --> 01:02:47.280
Curt Beckmann, Intel: I know that they've done at G. So then that was some some time ago. So I would think that they, they're able to do hundred g on an FPGA with people.

452
01:02:52.260 --> 01:02:52.650
Ted Marena: Okay.

453
01:02:54.930 --> 01:03:02.400
Ted Marena: So we can. We have a few more minutes here, we, you know, people want to ask questions, you can certainly ask it in the Q AMP. A or in the chat.

454
01:03:03.450 --> 01:03:13.050
Ted Marena: Via Mary, did you have any sort of like next steps are closing or did you just want to verbalize something I wasn't sure if you had sort of closing slides.

455
01:03:13.350 --> 01:03:14.250
Zvonimir Bandic: I go

456
01:03:14.910 --> 01:03:18.990
Zvonimir Bandic: Specific closing closing slide, but I would like to

457
01:03:20.190 --> 01:03:21.930
Zvonimir Bandic: Actually, I might I may even

458
01:03:23.610 --> 01:03:25.890
Ted Marena: So busy. Before you go on,

459
01:03:26.190 --> 01:03:29.220
Ted Marena: There's a question. The. Can you see the question in the Q AMP a

460
01:03:29.700 --> 01:03:35.070
Ted Marena: That was the female P for us or modified in the lab by Dan and Yvonne Amir

461
01:03:36.750 --> 01:03:38.160
Dejan: So I can probably take that one.

462
01:03:38.400 --> 01:03:38.970
Yeah.

463
01:03:40.290 --> 01:03:46.470
Dejan: Just to be clear, it was used or modified by neither Dan or I, this was some very bright people to

464
01:03:47.610 --> 01:03:54.900
Dejan: Work in my group. But yeah, we had not actually modify the switch right so the switch itself does not come.

465
01:03:55.500 --> 01:04:05.070
Dejan: With any kind of programming. Right. It's not secret specific like it. Actually, it will receive Ethernet frame packets, but it doesn't contain you know TCP IP or any, any of that.

466
01:04:05.850 --> 01:04:12.870
Dejan: Firmware on it. So all we did was we program the switch to a accepts a specific packet format. It's completely different for me than it

467
01:04:13.530 --> 01:04:18.510
Dejan: So all the changes that we did. Were you know in basically programming the chip itself.

468
01:04:18.960 --> 01:04:29.640
Dejan: And then there's a different project, which a lot of Miss showed. So we took the Tofino chip and just made a blackboard, the motherboard there instead of having front plate TCP ports has different kinds of connection.

469
01:04:31.080 --> 01:04:42.630
Dejan: Formats straight like this new so 51,002 with it's going to be PC agent five and expose that are bored, just as a test for different kind of short reach connections. Does that answer your question.

470
01:04:44.820 --> 01:04:45.120
Dejan: He says,

471
01:04:45.570 --> 01:04:46.110
Ted Marena: I think

472
01:04:46.440 --> 01:04:49.080
Zvonimir Bandic: Because Google has asked is muted. So

473
01:04:49.500 --> 01:04:55.020
Dejan: Yeah, so there's another person who wants to answer the question live. I'm not sure how to unmute

474
01:04:55.320 --> 01:04:57.540
Ted Marena: I know it.

475
01:04:58.290 --> 01:05:06.510
Ted Marena: Is a shin. And I think the other thing I would add is I believe that P for Cody is available on GitHub.

476
01:05:07.260 --> 01:05:12.330
Dejan: Correct. Well there. The code is very minimal at this point because we only have the forwarding code and the control plane setup. Right.

477
01:05:13.110 --> 01:05:17.190
Dejan: What we are really the most exciting thing would be to have the code that actually

478
01:05:17.880 --> 01:05:23.670
Dejan: You know stops these packets, the coherence messages and then acts as a directory. So the switch with Active Directory for coherence.

479
01:05:24.120 --> 01:05:33.660
Dejan: Because this would allow you to scale out way beyond you know for sockets or eight sockets that you can do today. So with once, which you can scale to 256 lanes. If you do one lane per

480
01:05:34.560 --> 01:05:45.960
Dejan: But we want. We have much higher ambitions right we have ambitious with 10s of thousands of nodes. So for this we would need actually much part of you for programming to limit this practical here, as mentioned these so we can scale better

481
01:05:48.720 --> 01:05:49.020
Ted Marena: Okay.

482
01:05:49.980 --> 01:05:50.220
Jeffrey Osier-Mixon: There was

483
01:05:50.370 --> 01:05:50.700
Curt Beckmann, Intel: Just

484
01:05:50.790 --> 01:05:56.730
Curt Beckmann, Intel: In the chat. I also want to clarify the point was made that when you get to switch. It's, it's

485
01:05:58.140 --> 01:06:05.040
Curt Beckmann, Intel: It doesn't have any of the Ethernet IP kind of it depends where you buy your switch. If you buy your switch the white box. Then, as, as mentioned, it's

486
01:06:06.930 --> 01:06:20.310
Curt Beckmann, Intel: A raw machine. And if you buy it from say Arista or or Cisco or something they they set a bunch of people in there for you so you you kind of have your choice. But yeah, the good news is, if you want to

487
01:06:20.790 --> 01:06:26.880
Curt Beckmann, Intel: If you want your own sandbox. It comes as a sandbox. If you get the right from the right source.

488
01:06:28.110 --> 01:06:37.650
Ted Marena: Yeah, there's another question. So see Lee had asked how does a switch OS compare with software defined network.

489
01:06:39.690 --> 01:06:41.880
Curt Beckmann, Intel: So talking about it from the switch side so

490
01:06:43.320 --> 01:06:55.020
Curt Beckmann, Intel: If you buy, as I mentioned, if you buy from Cisco Arista I think that sometimes they'll say that they they let you support Sonic. And my guess is that's a negotiation typically Sonic is Sonic is

491
01:06:55.440 --> 01:07:01.320
Curt Beckmann, Intel: I forget what the acronym is now centers are like asking you about before, but it's one of the open networking

492
01:07:02.340 --> 01:07:03.510
Curt Beckmann, Intel: protocols that is

493
01:07:04.860 --> 01:07:19.290
Curt Beckmann, Intel: Very SDN friendly and then there's also stratum that is even more SDN friendly. Both of those are available on two versions of that are available that run on white box Tofino switches. Now then, of course, the

494
01:07:20.850 --> 01:07:28.230
Curt Beckmann, Intel: The SDN controller is kind of your assignment or you know that's outside of the box but yeah Tofino was

495
01:07:29.490 --> 01:07:38.160
Curt Beckmann, Intel: Barefoot and Tokyo were essentially, you know, born in the SDN heyday, you know, they're really SDN heyday. So they're very SDN friendly.

496
01:07:43.680 --> 01:07:56.790
Ted Marena: Also, just to let everybody know I put in the chat box. One more time. The link for all these presentations. So let me sort of throw it over to me or I think you had something you were going to summarize

497
01:07:57.090 --> 01:07:59.820
Zvonimir Bandic: I would like, I don't know if I can share it again.

498
01:08:01.140 --> 01:08:01.950
Zvonimir Bandic: All right, they can't

499
01:08:02.130 --> 01:08:02.940
Ted Marena: You shouldn't be able to

500
01:08:03.990 --> 01:08:04.950
Dejan: Alright, so

501
01:08:05.160 --> 01:08:06.000
Zvonimir Bandic: I want to

502
01:08:08.760 --> 01:08:10.560
Zvonimir Bandic: showed this slide.

503
01:08:12.330 --> 01:08:15.750
Zvonimir Bandic: So these are the active. These are the currently and

504
01:08:16.800 --> 01:08:17.640
Difficult

505
01:08:21.660 --> 01:08:23.220
Zvonimir Bandic: Zoom problems.

506
01:08:26.550 --> 01:08:29.970
Zvonimir Bandic: So these are the current, current work groups.

507
01:08:31.080 --> 01:08:36.180
Zvonimir Bandic: Active in trips Alliance and the activities that we discussed today, which is a tiling

508
01:08:37.950 --> 01:08:51.390
Zvonimir Bandic: Tiling protocol stylization. The only extend which is starting over internet and also AI be specification for chip. Let's are discussing the interconnect word group. So I want to encourage

509
01:08:52.740 --> 01:08:58.740
Zvonimir Bandic: Individuals who are interested to visit chips allows get up and and

510
01:08:59.970 --> 01:09:13.800
Zvonimir Bandic: All these activities are open source activities. So we encourage people to participate and and our, our get up ideas, simple, just chips allies, and in addition to that,

511
01:09:16.740 --> 01:09:24.900
Zvonimir Bandic: Companies can join as members of chips alliance and reach many more resources and collaboration as well as a face to face.

512
01:09:26.400 --> 01:09:34.710
Zvonimir Bandic: Well, not now. Not all of our face to face, but certainly online meetings every two weeks, or the development of specifications.

513
01:09:37.800 --> 01:09:46.470
Ted Marena: Great. Alright, so, um, Wesley or Kurt. Did you guys have anything else you wanted to sort of summarize or China here.

514
01:09:46.860 --> 01:09:56.310
Curt Beckmann, Intel: A good mention of the open source. I don't have a slide related to it but P for. There's a lot of people are on GitHub. And there's a lot of openness in that community as well.

515
01:09:57.000 --> 01:10:05.850
Curt Beckmann, Intel: Just recently, when the acquisition of of barefoot by Intel has sort of supported additional openness.

516
01:10:06.750 --> 01:10:11.280
Curt Beckmann, Intel: I think there was some things that barefoot, so I joined barefoot just after the Intel acquisition

517
01:10:12.120 --> 01:10:20.160
Curt Beckmann, Intel: And what I know about barefoot before then is, is what I've heard from others that they they kept some stuff a little bit closed.

518
01:10:20.940 --> 01:10:34.710
Curt Beckmann, Intel: What what they felt for their own protection, they don't feel that sensitive to that now. And so they're opening up even more. And there is a big open source community around people running on Tofino

519
01:10:38.940 --> 01:10:39.420
Great.

520
01:10:41.220 --> 01:10:53.370
Ted Marena: Okay. So Wesley, or let's see. Maybe we'll just hang here for another minute. Let's see. We there is another question. Speaking of development, the open wire. They're just a couple of commits and the tiling repo.

521
01:10:55.350 --> 01:10:55.560
Ted Marena: This

522
01:10:55.710 --> 01:10:56.250
Zvonimir Bandic: Is what I can

523
01:10:57.090 --> 01:10:57.930
Zvonimir Bandic: I can answer that.

524
01:10:58.140 --> 01:10:58.620
Ted Marena: Go ahead.

525
01:10:59.010 --> 01:10:59.580
Wesley Terpstra: I didn't even know

526
01:11:02.130 --> 01:11:02.790
Zvonimir Bandic: Nothing has

527
01:11:03.870 --> 01:11:06.120
Zvonimir Bandic: Just recently been moved from

528
01:11:07.530 --> 01:11:07.950
Zvonimir Bandic: From

529
01:11:09.270 --> 01:11:14.070
Zvonimir Bandic: From the previous report so it's it's relatively new to the chips alliance.

530
01:11:16.140 --> 01:11:20.790
Zvonimir Bandic: But it's been around for, you know, super long time obviously more than five years.

531
01:11:23.640 --> 01:11:25.440
Ted Marena: And there's also a question.

532
01:11:26.460 --> 01:11:38.670
Ted Marena: Tile links not looks nice, but only use for risk five to build an. Soc. How shall we use the interconnect to work with other existing system IPs that were provided by other vendors in arm camp, for example.

533
01:11:39.180 --> 01:11:46.410
Wesley Terpstra: Of this one's really easy. There are open source implementations of piling converters to ABB HP xi.

534
01:11:46.950 --> 01:11:56.910
Wesley Terpstra: So if you want to do this, you can just wrap up those IPS with a converter and then attach it the talent network. We did that with all sorts of IP that we acquire when we build an actual chip.

535
01:11:58.860 --> 01:12:14.130
Ted Marena: And also for Omni extend. That's an open specification. So if someone wants to put it on a processor, other than a risk five processor, then that's also possible. So it's a it's an open standard. That's one of the key differentiators

536
01:12:14.490 --> 01:12:20.910
Zvonimir Bandic: Grew up in I know several startup companies that are using tiling in the non responsive tips.

537
01:12:22.710 --> 01:12:22.860
Oh,

538
01:12:24.180 --> 01:12:27.900
Zvonimir Bandic: It's, it's already. It's already happening. I don't know if they use the

539
01:12:28.950 --> 01:12:35.700
Zvonimir Bandic: Open Source bus adapters that that's mostly mentioned or some other ones, but it's it's being used.

540
01:12:49.260 --> 01:12:50.040
Zvonimir Bandic: They're still there.

541
01:12:50.640 --> 01:12:54.150
Ted Marena: I'm still here. Yeah, I just figured I'd give everybody another second before

542
01:12:55.020 --> 01:12:56.610
Ted Marena: You initially sign off.

543
01:12:58.290 --> 01:12:59.220
Ted Marena: This for me.

544
01:12:59.460 --> 01:13:00.270
Zvonimir Bandic: Is for you.

545
01:13:00.360 --> 01:13:04.770
Ted Marena: Well, the last thing you want is to clone me I can absolutely assure you that

546
01:13:06.240 --> 01:13:19.110
Ted Marena: Okay. So I think, you know, we have recorded this so we'll be posting it and I already sent out the link for the the box folder link for everything.

547
01:13:20.400 --> 01:13:33.060
Ted Marena: And so I don't think there's anything else we can hang here for just a few more seconds to see if anyone has a question. And if not, we can just, you know, wish everyone a good night thank everybody for attending.

548
01:13:39.750 --> 01:13:40.380
Zvonimir Bandic: Thank you. Good.

549
01:13:41.190 --> 01:13:48.720
Ted Marena: Thank you. Thanks. Kurt thanks fun Amir Thanks Dan. Thanks. Wesley, Jeff, I really appreciate your help on this as well.

550
01:13:51.030 --> 01:13:51.780
Dejan: Thank you everyone.

551
01:13:52.560 --> 01:13:53.220
Ted Marena: Your say

552
01:13:53.550 --> 01:13:55.260
Ted Marena: Yep. Bye bye. Have a good night.

553
01:13:55.560 --> 01:13:56.430
Zvonimir Bandic: Thank you. Bye bye.

554
01:14:27.360 --> 01:14:28.470
Ted Marena: Thanks, Jeff. For our Bye bye.

555
01:14:30.270 --> 01:14:30.810
Jeffrey Osier-Mixon: See you later.

556
01:14:31.080 --> 01:14:32.130
Zvonimir Bandic: Yeah I stayed

557
01:14:32.400 --> 01:14:33.510
Ted Marena: I stayed to stop the

558
01:14:33.510 --> 01:14:34.230
Recording

559
01:14:35.760 --> 01:14:38.670
Zvonimir Bandic: They're still comments coming on a on a on a

560
01:14:38.820 --> 01:14:39.930
Zvonimir Bandic: On a zoom chat so

561
01:14:40.110 --> 01:14:42.510
Ted Marena: Oh really, okay I hanging