WEBVTT
1 00:00:49.890 --> 00:00:50.700 Jeffrey Osier-Mixon: Okay, we are recording.
2 00:00:51.000 --> 00:00:51.420 Okay.
3 00:00:52.560 --> 00:01:03.720 Ted Marena: Well, welcome everybody to the RISC-V Bay Area Meetup. I would expect some of us may not be in the Bay Area, because of the virtual nature of the event.
4 00:01:04.380 --> 00:01:20.790 Ted Marena: This is a presentation on a cache-coherent memory fabric which is based on and influenced by RISC-V. My name is Ted Marena and I'll just be doing a really quick introduction. Let me just share my screen. I just have a couple of slides.
5 00:01:22.140 --> 00:01:26.370 Ted Marena: So let me see if I can do that over here. So,
6 00:01:27.690 --> 00:01:34.530 Ted Marena: for those of you who are, you should be able to see my screen; it says RISC-V with these bubbles. Yes.
7 00:01:35.040 --> 00:01:36.000 Ted Marena: Okay, great. So
8 00:01:37.020 --> 00:01:45.930 Ted Marena: RISC-V, for those of you who are not aware (I think a lot of you are familiar with what it is), is basically an instruction set architecture. It sets the specifications
9 00:01:46.170 --> 00:01:58.650 Ted Marena: for what you can do with a processor core. It allows you to, you know, really have full control, and the real key here is the openness of the specification.
10 00:01:59.130 --> 00:02:13.410 Ted Marena: And the other part of the openness of the specification is that it just basically sets what you need to do. It doesn't actually talk about the implementation or how to do it.
So,
11 00:02:14.610 --> 00:02:28.980 Ted Marena: you know, really, that's where the group CHIPS Alliance comes in, where we're going to be talking about TileLink. TileLink is actually a bus and interface based off of RISC-V, and
12 00:02:30.210 --> 00:02:38.520 Ted Marena: we're also going to be talking about how TileLink has been extended, and that's what yields OmniXtend, and
13 00:02:41.610 --> 00:02:58.140 Ted Marena: the Intel Tofino switch is what we're using to do demonstrations and proof of concept for this capability. So what's really neat is that this is an architecture that's really been influenced and enabled by RISC-V.
14 00:02:58.590 --> 00:03:05.730 Ted Marena: So the group that all this work is being done in is called CHIPS Alliance. You can see a number of organizations that are part of the group.
15 00:03:06.510 --> 00:03:11.370 Ted Marena: The organization develops open source hardware as well as software development tools.
16 00:03:11.910 --> 00:03:21.180 Ted Marena: And there's a number of work groups that are part of this. And so what we're going to be focusing on is really the interconnect workgroup,
17 00:03:21.540 --> 00:03:34.650 Ted Marena: and two particular applications or developments within that: TileLink as well as OmniXtend. There's also a chiplet interface called AIB that's part of our interconnect workgroup.
18 00:03:35.460 --> 00:03:47.910 Ted Marena: That's also an Intel supported and sponsored project. So lots of different activities. You can see we have a cores work group and we have a software tools work group and so on.
19 00:03:48.660 --> 00:03:54.060 Ted Marena: So, CHIPS Alliance: if you want to know more, you can go to the chipsalliance.org website.
20 00:03:54.990 --> 00:04:04.650 Ted Marena: The whole idea of the group is it's an open, no-barrier collaboration. It allows companies to share development resources and lowers the cost of development.
21 00:04:05.100 --> 00:04:12.990 Ted Marena: And it can also provide a Red Hat model: you can introduce something that's open and then monetize it by putting support services around it.
22 00:04:13.800 --> 00:04:32.490 Ted Marena: So that's the end of my presentation; I'm going to stop my share. I do want to let everybody know, if you have questions, there is a Q&A box that I'll keep an eye on and monitor, and then you can also, I believe there's a place for you to raise your hand. So
23 00:04:33.540 --> 00:04:37.890 Ted Marena: Jeffro, I think, is there anything else that I should say before we hand the baton over to Zvonimir?
24 00:04:39.000 --> 00:04:39.990 Jeffrey Osier-Mixon: No, I don't think so.
25 00:04:40.350 --> 00:04:41.850 Zvonimir Bandic: I think Wesley is going first, though.
26 00:04:42.840 --> 00:04:46.710 Ted Marena: Oh, sorry. Yes, Wesley. I'm sorry. Yeah. So Wesley, why don't you
27 00:04:47.850 --> 00:04:48.540 Ted Marena: take over,
28 00:04:49.800 --> 00:04:51.660 Ted Marena: and you should be able to do this.
29 00:04:57.510 --> 00:04:59.820 Jeffrey Osier-Mixon: Looks like Ted is frozen, but you can go right ahead.
30 00:05:05.220 --> 00:05:05.790 Jeffrey Osier-Mixon: Frozen now.
31 00:05:06.690 --> 00:05:07.410 Ted Marena: Are you there?
32 00:05:12.330 --> 00:05:13.530 Ted Marena: Yes, you're on mute.
33 00:05:18.180 --> 00:05:19.740 Jeffrey Osier-Mixon: He says he cannot unmute.
34 00:05:20.130 --> 00:05:22.830 Jeffrey Osier-Mixon: Oh, give me one second to fix that.
35 00:05:24.330 --> 00:05:27.390 Ted Marena: Okay, that would be difficult if he
36 00:05:27.420 --> 00:05:28.800 Ted Marena: cannot tell you so.
37 00:05:28.950 --> 00:05:29.730 Jeffrey Osier-Mixon: Now we can hear you.
38 00:05:29.940 --> 00:05:30.240 Okay.
39 00:05:32.460 --> 00:05:37.140 Wesley Terpstra: I don't know. When I talked earlier, it worked. But anyway, let me just share my screen.
40 00:05:40.650 --> 00:05:45.030 Wesley Terpstra: And here you go, you can see my screen as well. Okay.
41 00:05:46.500 --> 00:05:53.910 Wesley Terpstra: So hi, I'm Wesley, I'm from SiFive, and I was asked to give you guys a brief introduction to what TileLink is.
42 00:05:55.500 --> 00:06:04.800 Wesley Terpstra: We use TileLink internally at SiFive for most of our on-chip interconnect, when we don't have to interface with other companies, obviously.
43 00:06:05.940 --> 00:06:21.630 Wesley Terpstra: But I'm going to give a quick overview of what it is, what it does, how it works, and what the future holds. So okay, what is TileLink? It's a pretty easy one. It's just a protocol for connecting masters, like cores, with slaves, like memory.
44 00:06:25.020 --> 00:06:30.870 Wesley Terpstra: That's it in a nutshell. Um, there are a lot of different buses like that. One of the things that makes TileLink somewhat special is
45 00:06:31.470 --> 00:06:36.120 Wesley Terpstra: we've gone out of our way to make the protocol have a very simple message vocabulary
46 00:06:36.540 --> 00:06:52.590 Wesley Terpstra: that works across different levels of complexity. So you can use the same protocol in very simple slave devices, but you can also use it in cache-coherent situations where you need to transfer both the data and the ownership of blocks back and forth.
47 00:06:55.620 --> 00:07:05.190 Wesley Terpstra: So just to, like, walk you through sort of the basics of this: the idea is that you have agents, which are, you know, components in your SoC that need to talk to each other.
48 00:07:05.760 --> 00:07:12.510 Wesley Terpstra: And they can either act as a master or a slave, because the TileLink protocol is a master-slave protocol; each link connects two agents together.
49 00:07:13.260 --> 00:07:20.730 Wesley Terpstra: And you move data across the link that connects them. So an example would be: the master might say, I want to do a write; that's a Put message.
50 00:07:21.240 --> 00:07:29.580 Wesley Terpstra: And then the slave, upon receipt of that message, will respond with an access acknowledgement saying whether or not the write succeeded.
51 00:07:30.270 --> 00:07:36.420 Wesley Terpstra: And of course you can use acknowledgements to do ordering of your operations and so on. But that's sort of in the weeds; we're not going to talk too much about that.
52 00:07:37.530 --> 00:07:44.520 Wesley Terpstra: You can also, of course, do Gets to access data. So at the simplest level of conformance you only have to support these two operations. That's it.
53 00:07:44.880 --> 00:08:01.320 Wesley Terpstra: So if you're writing, like, a SPI controller or something, it just has to take the reads and send up the data, take the writes and write the data; not much more than that. That's the UL here. So the TL obviously is for TileLink; the UL is for,
54 00:08:03.000 --> 00:08:09.180 Wesley Terpstra: man, I forgot. I think, oh yeah, U is for Uncached and L is for Lightweight. So this is the simplest version: Uncached Lightweight.
55 00:08:09.570 --> 00:08:23.370 Wesley Terpstra: And then you have the Uncached Heavyweight version, which adds atomics, operation hints, and a few other things that are important. So let's just take a look at what it looks like when you're talking about just the simplest level of TileLink, where you have uncached
56 00:08:24.570 --> 00:08:29.370 Wesley Terpstra: transactions, so Gets and Puts. So in this picture you've got your agents; these are the bubbles.
57 00:08:29.940 --> 00:08:36.150 Wesley Terpstra: And they're all connected by edges. Those are the TileLink edges. Notice in the picture we always point from the master to the slave.
58 00:08:36.540 --> 00:08:43.050 Wesley Terpstra: So TileLink defines what goes on this wire, well, on this interconnect between the master and the slave.
59 00:08:43.530 --> 00:08:53.250 Wesley Terpstra: To make it just really clear, I've just labeled all of the edges.
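The exchange being described, a Get answered by AccessAckData and a Put answered by AccessAck, is all a TL-UL slave has to support. As a rough illustration, here is a toy software model of such a slave. Everything beyond the TileLink message names is invented for this sketch, and real TL-UL is a hardware signal protocol, not Python.

```python
# Toy model of the TL-UL message vocabulary: a slave that only has to
# support Get (read) and PutFullData (write), answering with
# AccessAckData and AccessAck respectively. Names other than the
# TileLink message names are invented for this sketch.

class TLULSlave:
    """A minimal uncached-lightweight slave backed by a byte array."""

    def __init__(self, size):
        self.mem = bytearray(size)

    def handle(self, msg):
        op, addr = msg["op"], msg["addr"]
        if op == "Get":
            data = bytes(self.mem[addr:addr + msg["size"]])
            return {"op": "AccessAckData", "data": data}
        if op == "PutFullData":
            self.mem[addr:addr + len(msg["data"])] = msg["data"]
            return {"op": "AccessAck"}
        raise ValueError("a TL-UL slave only speaks Get and PutFullData")


slave = TLULSlave(256)
ack = slave.handle({"op": "PutFullData", "addr": 16, "data": b"\xde\xad"})
resp = slave.handle({"op": "Get", "addr": 16, "size": 2})
print(ack["op"], resp["op"], resp["data"].hex())  # AccessAck AccessAckData dead
```

The point of the sketch is how little a TL-UL endpoint has to do: service two request types and answer each with its acknowledgement.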
You can see every agent here acts as a master or slave on every edge. A single master could be,
60 00:08:53.760 --> 00:08:58.560 Wesley Terpstra: sorry, a single agent could be both a slave on one TileLink link and a master on a different TileLink link.
61 00:08:59.130 --> 00:09:10.860 Wesley Terpstra: Or, in the case of, like, this router here, it can be a slave on two links and a master on two links. So agents are connected to multiple TileLink links, potentially, and they can take on either the master or slave role, or both.
62 00:09:12.180 --> 00:09:23.790 Wesley Terpstra: The TileLink protocol, again, is only talking about what happens on these edges that connect the agents. We don't tell you what you have to do inside an agent; that's the microarchitecture, and that's up to the designer. So
63 00:09:25.560 --> 00:09:31.920 Wesley Terpstra: here's just an example. Say the processor wants to do a read; it can issue a Get message to its immediate neighbor.
64 00:09:32.400 --> 00:09:42.840 Wesley Terpstra: So the master, the core, issues the Get request to the slave, and then the slave is under obligation to answer that Get, but it might, you know, send it through the network.
65 00:09:43.890 --> 00:09:52.980 Wesley Terpstra: Essentially the Get message is sent over each of these TileLink links till it eventually gets to the memory. And then of course the memory can respond with the data, and
66 00:09:53.460 --> 00:10:01.620 Wesley Terpstra: the answers come back. So what we've seen here is, we generally draw the picture with the arrow pointing from master to slave, but sometimes the message is going in the opposite direction.
67 00:10:02.730 --> 00:10:07.350 Wesley Terpstra: Okay, so, I mean, that's pretty straightforward. I think everyone's familiar with buses like that.
68 00:10:08.730 --> 00:10:16.260 Wesley Terpstra: There is one thing, though, that you might want to improve. So Gets and Puts essentially leave the ownership of the data with the slave.
What do I mean by that? I mean,
69 00:10:17.010 --> 00:10:25.380 Wesley Terpstra: it's very simple in that, like, if you're a master and you send a request out, firstly, you're only sending it to one person, that is, your slave connection, obviously.
70 00:10:25.680 --> 00:10:29.700 Wesley Terpstra: But if you're, like, an interconnect, like that router in the picture here, then you have to decide:
71 00:10:30.180 --> 00:10:40.080 Wesley Terpstra: a message came in, where do I send it? It's really simple. You just know where to send it, because it's whoever owns that address; that's where it goes. So it's very easy to, like, route when you only have Gets and Puts.
72 00:10:40.650 --> 00:10:46.230 Wesley Terpstra: And also it's very easy to resolve any kind of ordering or race conditions, because if you're a slave that's memory,
73 00:10:46.560 --> 00:10:51.150 Wesley Terpstra: the requests come in, and when they arrive at you, the memory, you decide the order of the operations.
74 00:10:51.390 --> 00:11:05.100 Wesley Terpstra: So this is a really simple model to work with, and it's pretty decent if you're doing, like, high-throughput, latency-insensitive kinds of operations, like DMA. So, like, if you're streaming a lot of data to or from, like, a disk or the network, this is a pretty good model.
75 00:11:06.240 --> 00:11:17.070 Wesley Terpstra: But obviously I wouldn't be talking if there wasn't something here to do better. The question is, what if the data is far away, perhaps, you know, across the chip or even over the network, and you want to use it more than one time?
76 00:11:18.750 --> 00:11:33.150 Wesley Terpstra: If you're in that situation, those repeated accesses may waste traffic, and that in turn wastes power. But particularly for processors; at SiFive, of course, we sell processors, so we care a lot about this scenario.
77 00:11:34.470 --> 00:11:44.010 Wesley Terpstra: Processors are quite sensitive to latency.
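The routing rule just described, that a Get or Put simply goes to whichever slave owns the address, can be sketched in a few lines. The address map below is made up for illustration; it is not any real chip's memory map.

```python
# Address-based routing for Gets and Puts: an interconnect forwards each
# request to whichever slave's address range contains the target address.
# The address map below is invented for illustration only.

def make_router(address_map):
    """address_map: list of (base, size, slave_name) tuples."""
    def route(addr):
        for base, size, slave in address_map:
            if base <= addr < base + size:
                return slave
        raise ValueError(f"no slave owns address {addr:#x}")
    return route

route = make_router([
    (0x0000_0000, 0x1000, "boot-rom"),
    (0x0200_0000, 0x1000, "spi"),
    (0x8000_0000, 0x4000_0000, "ddr"),
])

print(route(0x8000_1000))  # ddr
print(route(0x0200_0004))  # spi
```

Because ownership never moves in the Get/Put-only world, this static lookup is the entire routing decision, which is exactly why the uncached levels are so cheap to build.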
So if you do, like, a load on a processor and a lot of instructions depend on it, you want to get your answer quite quickly. So
78 00:11:44.520 --> 00:11:51.990 Wesley Terpstra: it really hurts performance if you can't answer the query without crossing the chip. So that's where we come to transfer of ownership.
79 00:11:52.560 --> 00:12:00.090 Wesley Terpstra: So TileLink lets you move where the block is owned. Just a quick terminology thing: a block is just a chunk of memory,
80 00:12:00.630 --> 00:12:10.770 Wesley Terpstra: typically 64 bytes, or, you know, 128, in this range, right? Just a chunk of memory, and who is responsible for that chunk of memory can move throughout the system. So, and
81 00:12:11.010 --> 00:12:16.770 Wesley Terpstra: again, in the simplest scenario we were talking about initially, it's always the slave that owns the cache block.
82 00:12:17.040 --> 00:12:25.650 Wesley Terpstra: So if the master wants to perform operations, it sends a request to the slave, the slave does whatever the thing was, like a Get or a Put, and it gives the answer back.
83 00:12:26.130 --> 00:12:34.860 Wesley Terpstra: So in that scenario, the only one with a copy of the block is the slave. That's the unique case, right? Only one agent has it, and it's the slave.
84 00:12:35.700 --> 00:12:42.780 Wesley Terpstra: You can also, when you have data movement, sorry, ownership movement, move that relationship into a cache. So
85 00:12:43.050 --> 00:12:53.130 Wesley Terpstra: the master can become the one who has the only unique copy, and then he's able to do his reads and writes without having to go across the link. So he's taken the block from the slave and he owns that block.
86 00:12:54.390 --> 00:12:59.130 Wesley Terpstra: And if the slave wants to do an operation, it's not allowed to anymore; it has to ask for the block back from the master.
87 00:12:59.700 --> 00:13:09.570 Wesley Terpstra: And finally, there's this case where you can have it shared.
So both the slave and the master have a copy, and they can both therefore service reads, but neither of them can
88 00:13:09.990 --> 00:13:17.160 Wesley Terpstra: perform writes until a change in ownership happens such that one of the sides is unique. So that's the way TileLink ownership operates.
89 00:13:17.670 --> 00:13:22.740 Ted Marena: Wesley, just hold on a second. I just wanted to see if you might be able to answer a question that came in on the chat.
90 00:13:22.950 --> 00:13:24.600 Wesley Terpstra: Yeah, I was trying to figure out how to work that in.
91 00:13:24.600 --> 00:13:28.110 Ted Marena: Okay, yeah, I mean, do you want to just answer it live really quick? All right.
92 00:13:28.620 --> 00:13:33.660 Ted Marena: Go ahead. The question is: are there any restrictions on how quickly the slave needs to respond to a request?
93 00:13:33.840 --> 00:13:48.900 Wesley Terpstra: No. The TileLink protocol explicitly forbids you from, in fact, building timers into the protocol, because, and we're going to come to this later, with TileLink, if you build your system conforming to the requirements of the specification, the system cannot deadlock.
94 00:13:49.260 --> 00:13:50.640 Wesley Terpstra: So if you have timeouts
95 00:13:50.880 --> 00:13:58.080 Wesley Terpstra: in a system for a slow-responding slave, you can wind up in situations where you, like, answer a message twice. Like,
96 00:13:58.560 --> 00:13:59.760 Wesley Terpstra: send the request to the slave,
97 00:14:00.300 --> 00:14:11.190 Wesley Terpstra: the interconnect decides the slave is down and responds with, like, an error, and then later the slave responds with the actual answer. Those are really difficult problems to deal with in a real system.
So actually the TileLink protocol just, like,
98 00:14:11.640 --> 00:14:16.650 Wesley Terpstra: flat out says you're not allowed to put timers in, unless you're interfacing with a device known to be buggy, and if you do,
99 00:14:16.740 --> 00:14:19.650 Wesley Terpstra: the timer cuts that device off completely. So
100 00:14:19.920 --> 00:14:24.540 Wesley Terpstra: there is no time limit baked into the protocol. In fact, the opposite.
101 00:14:25.530 --> 00:14:33.720 Wesley Terpstra: But in practice, obviously, you know, the time it takes to respond is going to depend on the memory you're talking to. If it's something off chip, it'll probably be a hundred nanoseconds for DDR or something;
102 00:14:34.200 --> 00:14:38.460 Wesley Terpstra: you know, a cache, if you're going to a cache, you know, less than that. And then longer if you're going out to the network.
103 00:14:39.420 --> 00:14:41.490 Ted Marena: Wesley, sorry, keep going. We just wanted
104 00:14:41.490 --> 00:14:42.660 Ted Marena: to get that quick answer.
105 00:14:42.750 --> 00:14:44.370 Wesley Terpstra: All right, got it on my side.
106 00:14:45.630 --> 00:14:47.490 Wesley Terpstra: I'm moving on. So,
107 00:14:47.520 --> 00:14:52.740 Wesley Terpstra: so how can ownership be exchanged? We've seen in this picture that the ownership can move from the slave to the master, or be shared.
108 00:14:53.160 --> 00:14:59.520 Wesley Terpstra: So to do that we need to have messages that can do it. So the operations can be initiated by the master or the slave,
109 00:15:00.000 --> 00:15:08.310 Wesley Terpstra: and the operation might be causing the master to increase his ownership, or the slave to increase its ownership. So for this sort of quadrant of four things, there are,
110 00:15:08.700 --> 00:15:22.230 Wesley Terpstra: obviously, four possible messages. So the Acquire message is a message a master sends to obtain the block. So in this picture,
that would be like: you start with the block owned by the slave, the master issues an Acquire, and you end up with the master owning the block.
111 00:15:23.400 --> 00:15:32.190 Wesley Terpstra: Similarly, you could have a master-initiated Release. This is really common in caches, where you want to pull something new in but you don't have space, so you have to get rid of something you already have.
112 00:15:32.580 --> 00:15:38.850 Wesley Terpstra: So that's where you say: I have this block, I don't want it anymore; here, slave, you can have it back again. That's a Release message.
113 00:15:39.720 --> 00:15:47.760 Wesley Terpstra: Another really important one is Probes. This is where you have, say, an interconnect entity, for example this router here. Say
114 00:15:48.120 --> 00:15:55.530 Wesley Terpstra: one core takes the block and the other core wants to perform an access. So he asks the router for it; the router has to go get it back. So that's the Probe.
115 00:15:55.950 --> 00:15:59.940 Wesley Terpstra: With that, you can send a request back to a master and be like: hey, give me that block back,
116 00:16:00.930 --> 00:16:10.050 Wesley Terpstra: so the slave can recover its ownership. There is actually a fourth potential thing that very few protocols implement; TileLink also currently doesn't do this, although there has been talk of potentially having it.
117 00:16:10.380 --> 00:16:18.300 Wesley Terpstra: This is the so-called Stash. This would be where a slave decides that the master should actually have a block. An example might be, like,
118 00:16:18.780 --> 00:16:27.420 Wesley Terpstra: a really smart storage controller that has seen many access patterns and knows that if you read this block and this block, you're going to want that block next; sort of like a reverse prefetcher.
119 00:16:27.750 --> 00:16:38.970 Wesley Terpstra: And so when it sees the next core asking, it can say, like: here, have this block.
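The ownership story so far (a block held uniquely by the slave, uniquely by the master, or shared, moving via Acquire, Release, and Probe) can be condensed into a small table-driven sketch. The state names and the coarse transition table are invented simplifications; the actual permission transitions in the TileLink spec are finer-grained than this.

```python
# Toy model of TileLink ownership: who may service reads/writes locally
# in each state, and how Acquire / Release / Probe move a block between
# states. State names and this coarse model are invented for
# illustration; the spec's real permission lattice is finer-grained.

PERMISSIONS = {
    "slave-unique":  {"master": set(),             "slave": {"read", "write"}},
    "master-unique": {"master": {"read", "write"}, "slave": set()},
    "shared":        {"master": {"read"},          "slave": {"read"}},
}

TRANSITIONS = {
    ("slave-unique",  "Acquire"): "master-unique",  # master obtains the block
    ("master-unique", "Release"): "slave-unique",   # master evicts it voluntarily
    ("master-unique", "Probe"):   "slave-unique",   # slave recalls it
    ("shared",        "Acquire"): "master-unique",  # upgrade to a writable copy
    ("shared",        "Probe"):   "slave-unique",   # recall the shared copies
}

def can_service_locally(state, side, op):
    return op in PERMISSIONS[state][side]

def step(state, message):
    if (state, message) not in TRANSITIONS:
        raise ValueError(f"{message} is illegal in state {state}")
    return TRANSITIONS[(state, message)]

state = "slave-unique"
state = step(state, "Acquire")      # master now owns the block
assert can_service_locally(state, "master", "write")
assert not can_service_locally("shared", "master", "write")  # must upgrade first
state = step(state, "Probe")        # slave recalls the block
print(state)  # slave-unique
```

The shared row captures the rule from the talk: both sides may service reads, but neither may write until a transfer message makes one side unique again.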
But regardless, these messages here, the Acquire, Release, and Probe: supporting those is what brings you to the third conformance level of TileLink, TileLink Cached.
120 00:16:40.260 --> 00:16:48.930 Wesley Terpstra: Okay, so how do you actually build a bus like this? You know, there are a bunch of things you need to know: the types of messages that are sent; we've talked about all the important messages in TileLink now.
121 00:16:49.500 --> 00:16:54.810 Wesley Terpstra: Then there are the field definitions, things like, you know, the size of the block being transferred, who sent it,
122 00:16:55.530 --> 00:17:00.840 Wesley Terpstra: what's the address, and so on. You can read the spec for that sort of stuff. A really important aspect of any
123 00:17:01.320 --> 00:17:07.380 Wesley Terpstra: bus protocol, and in particular cache-coherent bus protocols, is ordering: what things happen in what order,
124 00:17:07.770 --> 00:17:21.390 Wesley Terpstra: like what messages may wait on other messages and what messages must not wait on other messages. So in TileLink that's dealt with by five priority levels that we define in the spec, and every message has a specified priority, and
125 00:17:22.530 --> 00:17:24.390 Wesley Terpstra: you are not allowed to wait on a
126 00:17:25.560 --> 00:17:35.640 Wesley Terpstra: lower-priority message to service a higher-priority one, in a nutshell. And if you follow those rules, then, like I said earlier, you can actually prove that a composed system of
127 00:17:36.180 --> 00:17:50.040 Wesley Terpstra: correctly implemented TileLink blocks cannot deadlock. There's one last thing you need to define, of course, which is how you encode the message on the wire. The nice thing with TileLink is you can serialize it lots and lots of different ways, and in fact we have done so. So this is the
128 00:17:50.070 --> 00:17:51.270 Ted Marena: Wesley,
Hold on a quick second 129 00:17:51.810 --> 00:17:54.900 Ted Marena: Can you address the question that came in is, it 130 00:17:55.020 --> 00:17:55.830 Wesley Terpstra: I don't see them. 131 00:17:56.100 --> 00:17:57.960 Ted Marena: Okay, yeah. So I'll ask it. 132 00:17:58.470 --> 00:18:08.790 Ted Marena: If they're if there's multiple masters across the network. What tile link. What will tiling say about memory coherence if masters don't have the same latency. 133 00:18:10.080 --> 00:18:16.140 Ted Marena: I think that's the question. And that may not be for tiling maybe that is an omni extend question. 134 00:18:16.200 --> 00:18:22.710 Wesley Terpstra: Oh, well, it's just a question for both, I think, since I'm next. In this talent network, but 135 00:18:24.660 --> 00:18:26.580 Wesley Terpstra: The answer okay so 136 00:18:28.200 --> 00:18:39.300 Wesley Terpstra: So if you have mastered at different latency. I mean, that's sort of like a new system. Right. So, for example, my desk right now I've got four VCU one on FPGA is have on extended if one of the FPGA. 137 00:18:39.870 --> 00:18:45.540 Wesley Terpstra: Has a cash block and a different one wants it. Well, then there is a high latency penalty for get that block back 138 00:18:46.020 --> 00:19:02.130 Wesley Terpstra: So accessing so bouncing a cash block between distant masters does take longer opposite than bouncing catch block to the masters that are on the same ship. So if you have to move a cache block a long distance, it will take longer. I mean, yeah. 139 00:19:04.080 --> 00:19:04.560 Ted Marena: Thanks. 140 00:19:04.980 --> 00:19:16.500 Wesley Terpstra: So the this picture here is a picture of the contract protocol we use in this 1.9 spec or 1.8 spec. I can actually see mounts like at 1.8 spec. 141 00:19:17.430 --> 00:19:25.260 Wesley Terpstra: So in this spec. There's this is showing like an example get message. I'm not going to go into the details. 
If you care about this serialization, you can read the spec.
142 00:19:26.250 --> 00:19:37.680 Wesley Terpstra: Somewhat relevant is that you see the priority here on the left; that's letters A and D. So when you're doing just straight normal TileLink uncached, so TL-UL or TL-UH,
143 00:19:38.790 --> 00:19:47.100 Wesley Terpstra: you only need two priorities; that's the A and D priorities. So here we see, like, a Get message went out, the AccessAckData comes back; another Get goes out, AccessAckData comes back.
144 00:19:48.600 --> 00:19:57.510 Wesley Terpstra: Right. So there's different ways you can serialize TileLink. You can do it in parallel; the 1.8 spec specifies a parallel ready-valid bus, so that's
145 00:19:58.110 --> 00:20:12.750 Wesley Terpstra: that's this one here. This is the one we mostly use internally at SiFive on chip. Some people have done NoC-like things, where they take the on-chip TileLink messages, turn them into packets, then send them over the NoC.
146 00:20:14.250 --> 00:20:15.510 Wesley Terpstra: The idea here is just,
147 00:20:18.270 --> 00:20:26.490 Wesley Terpstra: TileLink has a lot of wires when it's parallel. You can't see that in this picture, but if you have, like, an address bus, you know, it's potentially like 40 or more bits wide, and the data bus can be wide.
148 00:20:26.670 --> 00:20:32.880 Wesley Terpstra: So if you want to save on wires, you can turn it into a NoC, where it's sort of like a mini serialized protocol on chip.
149 00:20:33.450 --> 00:20:42.570 Wesley Terpstra: You can also do it off chip. So we have, like, a packet-based parallel credited version that we call ChipLink. That was in a product we shipped, I'm not sure how many years ago now; quite a while ago.
150 00:20:42.990 --> 00:20:48.120 Wesley Terpstra: But it's proprietary. And you could also do it over the network, so I refer you to the next talk about that.
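The point that one message vocabulary admits many serializations (a wide parallel ready-valid bus on chip, packets over a NoC, ChipLink, or Ethernet) can be illustrated with a toy packetizer. The header layout below is invented for the sketch; the real encodings are defined in the TileLink and OmniXtend specs, not here.

```python
import struct

# Toy packetizer: the same abstract TileLink-style message can exist as
# a bundle of parallel wires or as a byte packet on a narrower link.
# The ">BBQ" header layout (channel letter, size, 64-bit address) is
# invented for illustration; real encodings live in the specs.

def pack(msg):
    header = struct.pack(">BBQ", ord(msg["channel"]), msg["size"], msg["addr"])
    return header + msg["data"]

def unpack(packet):
    channel, size, addr = struct.unpack(">BBQ", packet[:10])
    return {"channel": chr(channel), "size": size, "addr": addr,
            "data": packet[10:]}

# A request with no payload and a response carrying a data burst both
# round-trip through the same packet format.
req = {"channel": "A", "size": 2, "addr": 0x8000_0000, "data": b""}
rsp = {"channel": "D", "size": 2, "addr": 0x8000_0000, "data": b"\xbe\xef"}
assert unpack(pack(req)) == req
assert unpack(pack(rsp)) == rsp
```

Note how the header precedes any payload bytes, matching the burst answer later in the talk that over a packetized transport the command comes first and then the data payload.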
151 00:20:49.680 --> 00:20:56.820 Wesley Terpstra: Here's just a picture of, like, a real chip that we've taped out, not sure how long ago; quite a while at this point.
152 00:20:58.140 --> 00:21:06.480 Wesley Terpstra: You can see that it's got, like, four cores here, a bunch of TileLink switches, there's a cache, a DDR controller, a bunch of slaves.
153 00:21:07.170 --> 00:21:16.020 Wesley Terpstra: There's this ChipLink thing where you can go over to another chip, to prototype, like, IP you'd like to test with your RISC-V SiFive chip. This is one of our chips. This is the SiFive
154 00:21:17.340 --> 00:21:19.800 Wesley Terpstra: Freedom Unleashed 540.
155 00:21:21.480 --> 00:21:25.260 Wesley Terpstra: So you can see, though, that the whole thing is glued together by TileLink edges. So you've got,
156 00:21:25.980 --> 00:21:41.670 Wesley Terpstra: between the cores: these are the Linux-capable cores, and when they talk to the L2 they're speaking the coherent version of TileLink on these edges here. And this core is, like, a management core; it doesn't have caches, well, the cache doesn't really count there.
157 00:21:42.690 --> 00:21:47.520 Wesley Terpstra: You see, I-caches are incoherent here in RISC-V usually, so this is just the uncached protocol.
158 00:21:48.720 --> 00:21:52.470 Wesley Terpstra: When you get down to the backside of the L2, you're just speaking the uncached version again.
159 00:21:54.330 --> 00:22:00.360 Wesley Terpstra: Then we've got the ChipLink here. So the cores can actually go and cache memory from another chip. So if this had,
160 00:22:00.630 --> 00:22:07.680 Wesley Terpstra: say, a DDR controller here, you could actually pull that block all the way over here, all the way to the other chip, into the cache of a core. So
161 00:22:08.520 --> 00:22:20.640 Wesley Terpstra: again, that's sort of the idea of OmniXtend as well, right: you're going to take remote memory and you can use it in your own local cache.
So even though normally, to go over here, your first access is, like, maybe two microseconds,
162 00:22:20.910 --> 00:22:25.320 Wesley Terpstra: when you're using it normally here, you're back into the, you know, low-nanoseconds access time.
163 00:22:26.430 --> 00:22:31.740 Wesley Terpstra: But yeah, that's it. All these peripherals here are all speaking the TL-UL protocol, because
164 00:22:32.910 --> 00:22:37.560 Wesley Terpstra: why would you do more? These are easy peripherals; they just speak the simplest protocol. So, in summary:
165 00:22:41.460 --> 00:22:46.080 Wesley Terpstra: TileLink defines a single message vocabulary. I didn't mention this earlier,
166 00:22:47.100 --> 00:22:49.140 Wesley Terpstra: I meant to, but I got derailed.
167 00:22:50.160 --> 00:22:58.020 Wesley Terpstra: This is really helpful for interoperability, because you can just take, like, these different components that we've got, with the different conformance levels of TileLink, and just plug them together,
168 00:22:58.230 --> 00:23:12.150 Wesley Terpstra: and it works. And if any of you have worked with, like, the AMBA protocols, with things like AXI, AHB, ACE, CHI: they all have different semantics. Like, the messages are different, the protocol fields are different, the ordering rules are different. And so you end up with this really expensive
169 00:23:13.560 --> 00:23:18.420 Wesley Terpstra: conversion wherever you want to go from one protocol to another. So the nice thing about using TileLink is,
170 00:23:19.140 --> 00:23:31.440 Wesley Terpstra: it's one protocol. It just has sort of levels of conformance, and when you go between those levels, the semantics don't change. So the ordering semantics are the same, and the rules for which messages respond to which messages don't change.
171 00:23:33.330 --> 00:23:38.310 Wesley Terpstra: So you don't have to worry too much about the impedance mismatch between the protocols.
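The deadlock-freedom argument from earlier in the talk (five priority levels, and no message may wait on lower-priority traffic) is simple enough to sketch as a checker. The five channel letters A through E appear in the TileLink spec; the checker itself is an invented illustration of the rule, not spec text.

```python
# TileLink's deadlock-avoidance rule in miniature: messages ride five
# channels with fixed priorities (A lowest, E highest), and servicing a
# message may only wait for strictly higher-priority traffic. Because
# waiting chains always climb in priority, they can never form a cycle.

PRIORITY = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}

def may_wait_on(waiting, blocking):
    """May a message on channel `waiting` stall until progress is made
    on channel `blocking`? Only if `blocking` is strictly higher."""
    return PRIORITY[blocking] > PRIORITY[waiting]

assert may_wait_on("A", "D")        # a Get may wait for its AccessAckData
assert not may_wait_on("D", "A")    # a response may never wait on new requests
assert not may_wait_on("C", "C")    # no waiting within the same priority level
```

This also explains the earlier answer about timeouts: since every legal waiting chain terminates, forward progress is guaranteed by construction, and timers would only add the double-answer hazards described in the Q&A.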
172 00:23:38.700 --> 00:23:45.240 Wesley Terpstra: And because we have the three conformance levels, you only have to pay for what you need. Like, here we had all these tiny peripherals; they could just use the simple protocol
173 00:23:45.810 --> 00:23:52.050 Wesley Terpstra: and not pay a lot of hardware costs, and there are a lot of them. But here on the cores, you know, we want to pay hardware for caching, so we do.
174 00:23:52.890 --> 00:23:58.740 Wesley Terpstra: And finally, just because there's this singular vocabulary doesn't mean there's only one way to transmit it.
175 00:23:59.010 --> 00:24:08.820 Wesley Terpstra: And so that's where we have different ways of serializing it, like we were talking about. You can do a parallel bus, with the address and data in parallel; you can packetize it; and you can manage the transmission of
176 00:24:09.810 --> 00:24:14.130 Wesley Terpstra: your messages ready-valid or credit-based. So that's it.
177 00:24:14.160 --> 00:24:15.540 Ted Marena: One question. One more question:
178 00:24:15.900 --> 00:24:21.090 Ted Marena: does TileLink-C support burst serialization for Grant data messages?
179 00:24:23.700 --> 00:24:39.360 Wesley Terpstra: Burst serialization, I mean, yes. I mean, Grant data messages are almost always bursts, so yes. Or are you asking, maybe this question was directed at OmniXtend, in which case the answer is still yes: all messages sent over OmniXtend are in a simple packet form, so
180 00:24:39.870 --> 00:24:41.790 Wesley Terpstra: the command comes first and then the data payload.
181 00:24:42.510 --> 00:24:51.090 Ted Marena: And then we had one other question. Let's see: do the agent nodes perform a handshake to decide which variant of the protocol to use?
182 00:24:51.600 --> 00:24:58.440 Wesley Terpstra: No. So on chip, you know this ahead of time; if you were to do that, it would be a waste of hardware.
So generally, you just 183 00:24:59.370 --> 00:25:03.810 Wesley Terpstra: hook up agents that conform correctly. So at SiFive, we have another technology we 184 00:25:04.320 --> 00:25:15.900 Wesley Terpstra: have not talked about, Diplomacy, which does this sort of handshake at compile time, so that you only put down hardware that you actually need. But there's no handshake once the chip is built. 185 00:25:16.980 --> 00:25:21.780 Wesley Terpstra: That would be a total waste, like if a little slave had to negotiate before it could turn on. 186 00:25:21.780 --> 00:25:23.490 Dejan: Say, this is Dejan. Can you guys hear me? 187 00:25:24.060 --> 00:25:33.720 Dejan: I guess, yeah. So this was probably a question relating also to OmniXtend, right? In the case of OmniXtend it's a different situation. So can you touch on that? 188 00:25:33.810 --> 00:25:35.010 Wesley Terpstra: Okay, sure, sure. So 189 00:25:36.240 --> 00:25:48.510 Wesley Terpstra: I was trying to answer the TileLink question. But yes, if it's OmniXtend, and you're plugging together chips that don't know about each other, then yes, there is one. We're working on it; the official name is still open, but at one point it was called the "screech" protocol. 190 00:25:50.010 --> 00:25:58.470 Wesley Terpstra: The idea there is that when you plug two chips together over OmniXtend, they broadcast their capabilities with essentially a device tree, and then a coordinator 191 00:25:58.470 --> 00:26:01.110 Wesley Terpstra: node assembles the aggregate system 192 00:26:01.320 --> 00:26:02.580 Wesley Terpstra: and then tells everyone the answer. 193 00:26:02.940 --> 00:26:12.090 Wesley Terpstra: So that's how discovery works when you are actually hot-plugging hardware. In a chip, though, that's very different; there's no hot-plugging in chips, so you wouldn't want to pay that overhead. 194 00:26:13.290 --> 00:26:13.530 Wesley Terpstra: Right.
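The discovery flow Wesley describes (chips broadcast capabilities as a device tree, a coordinator assembles the aggregate system and tells everyone the answer) can be sketched roughly as follows. Every name and field here is a hypothetical illustration, since the actual OmniXtend discovery protocol is, per the talk, still being defined.

```python
# Hypothetical sketch of boot-time discovery over a cache-coherent fabric.
# Field names ("id", "mem_bytes", "harts") and the base address are made up.

def discover(nodes):
    """Coordinator: collect each chip's broadcast capabilities and
    assemble one aggregate system description with a shared address map."""
    aggregate, base = {}, 0x8000_0000
    for node in sorted(nodes, key=lambda n: n["id"]):
        aggregate[node["id"]] = {"base": base,
                                 "size": node["mem_bytes"],
                                 "harts": node["harts"]}
        base += node["mem_bytes"]  # carve out a non-overlapping region
    return aggregate

# Two boards plugged together over the fabric announce themselves:
nodes = [{"id": "board-a", "mem_bytes": 1 << 30, "harts": 4},
         {"id": "board-b", "mem_bytes": 1 << 30, "harts": 4}]
system = discover(nodes)
# The coordinator then tells every node the merged map.
print(system["board-b"]["base"] - system["board-a"]["base"])  # 1 GiB apart
```

On-chip, by contrast, this merge would happen once at hardware-elaboration time (what Diplomacy does), so no runtime handshake logic is ever built.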
195 00:26:13.560 --> 00:26:23.490 Ted Marena: And then the last question, which I think may be an OmniXtend question as well. The picture you showed of the Freedom Unleashed had four cores. Can it be 16 or more? 196 00:26:24.810 --> 00:26:26.820 Ted Marena: Can TileLink handle that for cache coherence? 197 00:26:26.940 --> 00:26:36.450 Wesley Terpstra: Right. So the TileLink protocol just tells you what messages you send back and forth. It doesn't say anything about microarchitecture. 198 00:26:36.840 --> 00:26:42.030 Wesley Terpstra: If the question is how many cores you can build, that's more a question of microarchitecture. So 199 00:26:42.720 --> 00:26:48.210 Wesley Terpstra: you'll notice in this picture we have a switch. What's not shown in this picture is the banking; the L2 here has four banks. 200 00:26:48.720 --> 00:27:01.290 Wesley Terpstra: And if you have a switch, which is a crossbar from four to four, the cost of the switch grows quite dramatically. So this particular microarchitecture probably would not scale too well if you put 16 cores down right here. On the other hand, we would never build a system that way. 201 00:27:02.040 --> 00:27:10.590 Wesley Terpstra: We'd probably stamp out this whole gray box four times and then connect them with, like, an L3 or another level of some kind of coherence management there. 202 00:27:10.800 --> 00:27:21.000 Wesley Terpstra: So the TileLink protocol says nothing about the limits; it's all about what your microarchitecture is. I don't think this microarchitecture scales up to 16 cores in the gray box, but you can stamp out four of the gray boxes, no problem. 203 00:27:21.420 --> 00:27:31.920 Wesley Terpstra: You can build a ring topology instead of a crossbar to connect things together. So, I mean, scalability questions are really microarchitecture questions; the TileLink protocol stays the same regardless. 204 00:27:32.340 --> 00:27:48.210 Ted Marena: Okay, well, why don't we...
Let's move on. So Zvonimir and Dejan, why don't you guys take over the screen, and Wesley, maybe you can look at the questions and see if you can type an answer. We do need to keep going here. Okay. 205 00:27:48.360 --> 00:27:49.200 Wesley Terpstra: I'll try not to forget it. 206 00:27:49.590 --> 00:27:49.980 Okay. 207 00:27:52.110 --> 00:27:54.360 Ted Marena: So Zvonimir, are you going to share? 208 00:27:58.830 --> 00:28:00.840 Jeffrey Osier-Mixon: Maybe he's having some difficulty; bear with us. 209 00:28:02.490 --> 00:28:03.270 Zvonimir Bandic: Am I sharing? 210 00:28:03.540 --> 00:28:05.010 Ted Marena: Yes, we can see it now. 211 00:28:05.850 --> 00:28:06.810 Jeffrey Osier-Mixon: All right. Excellent. 212 00:28:07.230 --> 00:28:08.130 Alright, so 213 00:28:09.360 --> 00:28:13.440 Zvonimir Bandic: Good evening, everybody. Thank you so much for coming to this meetup. 214 00:28:14.490 --> 00:28:24.750 Zvonimir Bandic: Excited to talk about the OmniXtend fabric. Twenty minutes is probably not enough, but Dejan and I will try to cover the interesting things: 215 00:28:25.800 --> 00:28:30.360 Zvonimir Bandic: how this actually came about, and then we'll share some exciting 216 00:28:31.980 --> 00:28:35.970 Zvonimir Bandic: experimental results, the data and measurements we've done on OmniXtend. 217 00:28:38.430 --> 00:28:45.510 Zvonimir Bandic: So I'll start with the data center CPU vision and the recently announced backplane reference design, 218 00:28:46.800 --> 00:28:50.490 Zvonimir Bandic: and then we'll jump into details of our 219 00:28:51.540 --> 00:28:54.810 Zvonimir Bandic: architecture and implementation and some performance numbers. 220 00:28:56.100 --> 00:28:56.760 Zvonimir Bandic: So, 221 00:29:00.510 --> 00:29:08.490 Zvonimir Bandic: the vision of a data center of the future is obviously a multi-threaded, multi-core CPU.
We see the 222 00:29:10.230 --> 00:29:15.510 Zvonimir Bandic: importance of getting to out-of-order RISC-V cores that can support general-purpose 223 00:29:16.890 --> 00:29:20.730 Zvonimir Bandic: operating systems and software applications. But we believe that 224 00:29:21.990 --> 00:29:34.920 Zvonimir Bandic: the most important connectivity for these CPU systems is that related to memory and accelerators, and this is basically how OmniXtend came to be. We wanted to develop an open source 225 00:29:35.520 --> 00:29:45.510 Zvonimir Bandic: interface, an open source implementation of the interfaces, that can be used to bring memory, and large amounts of persistent memory, to the system. 226 00:29:45.990 --> 00:29:57.150 Zvonimir Bandic: That was kind of the primary motivation. Along the way we obviously discovered how interesting it is to actually bring cache-coherent cores into the system 227 00:29:57.540 --> 00:30:05.250 Zvonimir Bandic: that can be used for specialized workloads, importantly machine learning and inference, and to allow those subsystems to 228 00:30:05.940 --> 00:30:17.790 Zvonimir Bandic: share memory coherently with the main cores that are running the OS. So that's really how this came to life, many years ago. 229 00:30:18.720 --> 00:30:27.810 Zvonimir Bandic: And the vision is really to allow a large number of RISC-V compute nodes to connect to a uniformly shared memory, exactly like in this picture. 230 00:30:29.010 --> 00:30:39.630 Zvonimir Bandic: Very much like the vision of using programmable switches; the base version of the protocol will work with Ethernet switches as well, on top of L2. 231 00:30:40.770 --> 00:30:45.690 Zvonimir Bandic: Programmable switches allow you to do interesting things and potentially bring more performance to the system. 232 00:30:46.560 --> 00:30:57.210 Zvonimir Bandic: And this kind of architecture enables memory appliances.
So for those of you who have heard about aggregation and disaggregation with storage, where you can virtually move 233 00:30:57.810 --> 00:31:03.210 Zvonimir Bandic: the storage associated with a specific node, or pool the storage from multiple nodes, to present 234 00:31:03.960 --> 00:31:14.760 Zvonimir Bandic: shared storage: this is exactly the kind of thing that we envision is possible with OmniXtend, and we are interested in building memory appliances for this kind of system. 235 00:31:15.270 --> 00:31:34.590 Zvonimir Bandic: And obviously we are also very interested in these RISC-V coherent nodes that can offload AI workloads. And one amazing, interesting thing: we are in the early days, where we're hooking up Western Digital and SiFive hardware, but 236 00:31:36.390 --> 00:31:48.630 Zvonimir Bandic: everything about the standard is open source, and a lot of the implementation pieces are also available. For example, the TileLink implementation is available and can be obtained from the CHIPS Alliance GitHub. 237 00:31:49.140 --> 00:32:05.700 Zvonimir Bandic: So this will mean that multiple different vendors will be able to use OmniXtend, and people and marketing managers will be able to envision cache-coherent systems with equipment from different vendors that can share memory. 238 00:32:07.410 --> 00:32:21.120 Zvonimir Bandic: And this is an amazing board. It took us a while to design this. Our former CTO Martin Fink unveiled this board at the 239 00:32:22.230 --> 00:32:34.500 Zvonimir Bandic: presentation at the RISC-V Summit last year, and we are bringing this board up in the lab. It's not fully brought up yet; corona slowed things down a bit. 240 00:32:35.190 --> 00:32:42.870 Zvonimir Bandic: The total setup, with oscilloscopes and power supplies, comes to almost a hundred kilos.
So it wasn't exactly portable. 241 00:32:44.220 --> 00:32:50.580 Zvonimir Bandic: But we are working on it, and we at least have the 242 00:32:51.930 --> 00:33:08.640 Zvonimir Bandic: QSFP lanes and SFP lanes up and running. We will share the designs with the whole world. I think currently the designs are available to CHIPS Alliance members in the private folders, and we plan to share the board as well. 243 00:33:10.260 --> 00:33:11.580 Zvonimir Bandic: So, so now 244 00:33:13.320 --> 00:33:27.360 Zvonimir Bandic: I'm going to let Dejan explain a little bit of the background on OmniXtend and show our demo and performance results. Dejan, I hope you are unmuted. 245 00:33:27.900 --> 00:33:30.060 Dejan: I can try talking. Is my audio good? 246 00:33:30.330 --> 00:33:31.530 Zvonimir Bandic: Yes. Perfect, yeah. 247 00:33:31.560 --> 00:33:37.680 Dejan: Okay, I'm not running video, because my audio is choppy from time to time and I don't want to make it worse. All right. So, hello everyone. 248 00:33:38.430 --> 00:33:53.640 Dejan: This is a slide that shows a very high-level vision of what OmniXtend and TileLink are, and I imagine most of you on this call know this stuff, but just to make sure, in case there are folks who are not familiar with what a coherence protocol means: this is 249 00:33:54.990 --> 00:34:01.620 Dejan: coming from the direction of memory-centric systems, of building systems with large amounts of disaggregated memory. 250 00:34:02.430 --> 00:34:07.980 Dejan: When you ask a random person on the street how to build a system with, you know, a petabyte of main memory, 251 00:34:08.430 --> 00:34:15.930 Dejan: they'll typically draw you something like the first picture on top. I'm not sure if I can point or zoom. I don't think I can point.
So I'll just describe where the pictures are. 252 00:34:16.860 --> 00:34:25.950 Dejan: The top picture, with the DRAM and a DMA engine, is typically how folks build systems with large amounts of memory that can be accessed from 253 00:34:26.370 --> 00:34:33.840 Dejan: more than, say, 16 cores, or whatever the largest core count comes to these days. And this is a system where you have an RDMA 254 00:34:34.380 --> 00:34:43.560 Dejan: setup. So you have a NIC with a DMA engine, and that NIC has some software that runs it and sets up tables that translate your local vision of virtual memory to a remote 255 00:34:44.070 --> 00:34:55.950 Dejan: vision of a cloud address space, and then you call these software functions to fetch chunks of memory that are typically, you know, kilobyte or megabyte sizes. That's the little 256 00:34:56.610 --> 00:35:07.710 Dejan: lightning bolt that you see. So the issue, and the main issue for us as a storage maker, is that certain kinds of new byte-addressable storage have very little latency, 257 00:35:08.490 --> 00:35:19.260 Dejan: while the whole process of fetching and context switching just takes longer; it takes on the order of several microseconds. And we have technologies that are in the microsecond range, so we need something faster. 258 00:35:20.220 --> 00:35:31.380 Dejan: Now, a lot of you may have heard about systems such as The Machine; that's the picture on the right, where you do away with the DMA engine and you have some sort of tables that translate your local address space onto the cloud 259 00:35:32.040 --> 00:35:37.200 Dejan: addressing, but you still have some software that manages the translation; in the case of The Machine this was called the Librarian. 260 00:35:37.830 --> 00:35:52.800 Dejan: And there are other solutions out there. So this is again not what we mean by OmniXtend and TileLink. The picture on the bottom is correct, right?
The picture on the bottom means that you have a unified coherence space, right? So all your CPUs and all your memory controllers 261 00:35:54.000 --> 00:36:07.530 Dejan: sit on the other side of some imaginary fabric, and your page tables, that's what the diagram shows there, your page tables may not be local to your SoC. That's sort of the key thing here. So if we move on to the next slide... 262 00:36:12.690 --> 00:36:14.850 Dejan: It's coming slowly. 263 00:36:18.480 --> 00:36:23.820 Dejan: It's coming very slowly. I think we can skip this one. How much time do we have? 264 00:36:23.940 --> 00:36:36.450 Zvonimir Bandic: We have about ten minutes, so I think the audience will probably get most excited if you maybe start from the data plane implementation slide. Unless you want 265 00:36:37.740 --> 00:36:38.850 Zvonimir Bandic: you know, one of the earlier ones. 266 00:36:39.090 --> 00:36:42.840 Dejan: Sure, yeah. So let's talk about this. Right. So can you 267 00:36:43.290 --> 00:36:44.190 Dejan: maximize... 268 00:36:44.640 --> 00:36:46.170 Zvonimir Bandic: The data plane. The data plane one. 269 00:36:47.490 --> 00:36:48.270 Dejan: Yeah, that one. 270 00:36:50.550 --> 00:36:51.270 Dejan: Basically to 271 00:36:51.570 --> 00:36:52.650 Zvonimir Bandic: get to the demo and 272 00:36:52.650 --> 00:36:59.760 Dejan: Yeah, okay. So before we jump into the details: this is the actual bit format, the packet packing, that we have. 273 00:37:00.150 --> 00:37:05.100 Dejan: And I believe this is now obsolete, because I think we've changed back to L2 framing. 274 00:37:05.910 --> 00:37:11.100 Dejan: But the point here is that we looked at quite a number of options. So there was one question, right: how did we 275 00:37:11.550 --> 00:37:17.130 Dejan: choose TileLink? Right. So we didn't really choose TileLink; the problem was, there was nothing else out there.
276 00:37:17.730 --> 00:37:24.990 Dejan: And a similar problem we had with scaling TileLink off-chip is that there was really nothing out there that was open and not proprietary, 277 00:37:25.470 --> 00:37:33.060 Dejan: and also that would be widely available without everybody having to join an association. And so essentially we converged on Ethernet, 278 00:37:33.750 --> 00:37:40.170 Dejan: the reason being that that's pretty much the only fabric out there that's completely open, unencumbered, and really widely available. 279 00:37:41.040 --> 00:37:46.800 Dejan: It does have some quirks. Right. There's no in-order delivery, and it's not a reliable fabric, but we can work around these things. 280 00:37:47.340 --> 00:37:59.550 Dejan: And then essentially what we do is just package the TileLink messages into L2 frames, and we call that OmniXtend. So when somebody asks you what OmniXtend is, it's essentially TileLink over Ethernet. 281 00:38:00.600 --> 00:38:17.280 Dejan: Now, one of the questions that's certainly going to get asked: is this Ethernet-specific? By no means, right? This is TileLink over anything; anything is possible. We chose Ethernet just because you can use it without having to jump through too many hoops. So this slide shows 282 00:38:18.600 --> 00:38:23.610 Dejan: I think this is the obsolete 0.1 version of the header. The new header is actually available on GitHub; 283 00:38:24.450 --> 00:38:33.480 Dejan: you can download the specification document and look up the fields. And if you happen to be using one of the smart switches, the P4 switch from Barefoot Networks, now Intel, 284 00:38:34.320 --> 00:38:46.260 Dejan: you can actually parse the TileLink packets in the switch and do interesting things with them. So one of the great appeals of this method is that we could actually process the coherence protocol in the switch, on the fly. 285 00:38:48.120 --> 00:38:51.900 Dejan: Okay.
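As a rough illustration of "TileLink over Ethernet" as described above, here is a minimal Python sketch that wraps a TileLink-style message in an L2 frame, header first and payload after. The field sizes, the opcode value, and the EtherType are placeholders invented for this sketch; the real header layout is in the OmniXtend specification on GitHub.

```python
# Hedged sketch: pack a TileLink-style message into an Ethernet (L2) frame.
# ETHERTYPE_OMNIXTEND and the "!BQ" header layout are illustrative only.
import struct

ETHERTYPE_OMNIXTEND = 0xAAAA  # placeholder, not a registered EtherType

def pack_frame(dst_mac, src_mac, opcode, address, payload=b""):
    # Standard 14-byte Ethernet header: dst MAC, src MAC, EtherType.
    eth = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_OMNIXTEND)
    # TileLink-ish message: header comes first, then the data payload
    # (as with burst Grant messages in serialized packet form).
    tl = struct.pack("!BQ", opcode, address) + payload
    frame = eth + tl
    # Pad to the 64-byte Ethernet minimum frame size.
    return frame + b"\x00" * max(0, 64 - len(frame))

frame = pack_frame(b"\x02" * 6, b"\x04" * 6,
                   opcode=4, address=0x8000_0000)  # e.g. a read request
print(len(frame))  # 64: even a tiny request occupies a minimum-size frame
```

The quirks mentioned above (no in-order delivery, unreliable fabric) would sit in a layer around this packing, e.g. sequence numbers and retransmission, which this sketch deliberately omits.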
Can we move on to perhaps the performance slide? 286 00:38:56.700 --> 00:39:05.490 Dejan: Yeah, this is an example setup we have in the lab; let's flash by that, we probably don't want to describe it. Yes, let's talk about performance. So this slide here 287 00:39:06.660 --> 00:39:14.430 Dejan: shows our measurements. By the way, you can reproduce this yourself. All you need to do is buy two Xilinx eval boards, and this is the design 288 00:39:16.260 --> 00:39:17.940 Dejan: that runs the SiFive 289 00:39:19.410 --> 00:39:26.280 Dejan: cores; I believe this was U54 before, and now it's U74 cores. The binaries are available from GitHub for download. 290 00:39:26.760 --> 00:39:33.630 Dejan: And what you can do is set this up in three different ways. So you can set it up as a single board; that's the red line on this plot. 291 00:39:34.380 --> 00:39:39.480 Dejan: Or you can hook up two boards back to back with an Ethernet cable; those are the two green lines. 292 00:39:39.930 --> 00:39:49.080 Dejan: And then you can go through a switch, right? So you can basically connect two boards to an Ethernet switch, just any off-the-shelf switch, or you can use the Barefoot Tofino, like in this plot. 293 00:39:49.800 --> 00:40:00.210 Dejan: And these measurements are a little bit out of date, because we're still running at 50 MHz on the FPGA cores. But essentially, what they show is a random 294 00:40:00.780 --> 00:40:10.320 Dejan: memory access test: the x-axis shows the size of the window within which you are touching 64-byte cache lines, and the y-axes are 295 00:40:10.920 --> 00:40:18.270 Dejan: latency, measured in milliseconds and in clock cycles. Again, the milliseconds look pretty bad, but keep in mind that we're running an FPGA at 50 MHz. 296 00:40:19.230 --> 00:40:32.100 Dejan: So what you see on the left is the L1: every access within a 32K window
pretty much hits the cache, so you only see the two-clock-cycle L1 latency. 297 00:40:32.820 --> 00:40:39.540 Dejan: And then as you transition to a larger window, up to one megabyte, you slowly start to hit the L2 cache more and more. So you see 298 00:40:40.470 --> 00:40:48.330 Dejan: about a 27-clock-cycle latency for the L2, and then beyond one megabyte you start hitting DRAM. 299 00:40:48.900 --> 00:40:56.670 Dejan: And now things get interesting. So the red line shows the latency to access local DRAM on the SoC on which the CPU cores are. 300 00:40:57.570 --> 00:41:08.370 Dejan: The green lines show access to both local and remote DRAM, remote meaning that it's on the other board, right, through an Ethernet cable. 301 00:41:08.880 --> 00:41:19.170 Dejan: And you see that these lines are quite close. The reason you're not getting the red line here for the local DRAM is that every coherence request has to go across and check that the other cache doesn't have it. 302 00:41:19.950 --> 00:41:28.620 Dejan: And this was an early version of the protocol where we didn't have a directory implementation. So essentially all requests have to check all the caches, and obviously this does not scale. 303 00:41:29.220 --> 00:41:35.760 Dejan: So pretty soon we'll have a directory-enabled implementation that's going to eliminate this checking of all the caches. 304 00:41:36.720 --> 00:41:46.500 Dejan: And then as you move to the blue lines: these are the same latencies when you go through the switch. You see that the switch adds about 1.2 microseconds to the round trip, 305 00:41:47.070 --> 00:41:58.710 Dejan: whether for local or remote DRAM accesses. And the important point in this plot is that the red line and the green lines, so the accesses to local DRAM and the accesses over a cable
306 00:41:59.490 --> 00:42:13.890 Dejan: to a board directly attached, will scale with SoC frequency, to a large extent, whereas the green-to-blue difference will not. So that is the actual hard latency of going through the switch over the wire, and, you know, in a silicon realization this would improve at least four times. 307 00:42:16.890 --> 00:42:22.140 Dejan: So I think that's all we should say about this plot, and we're probably running out of time, or close. 308 00:42:22.560 --> 00:42:24.150 Zvonimir Bandic: I think we can show the 309 00:42:24.330 --> 00:42:25.740 Ted Marena: Yeah. One more minute still. 310 00:42:25.890 --> 00:42:27.150 Dejan: Yeah, this is 311 00:42:28.590 --> 00:42:31.320 Dejan: Yeah, you can take over the slide demo if you want to. 312 00:42:32.010 --> 00:42:37.170 Zvonimir Bandic: Yeah. So we showed this at the RISC-V 313 00:42:38.850 --> 00:42:49.110 Zvonimir Bandic: Summit, in the booth, last year. We had the two boards, those Xilinx boards with the big DRAMs that Wesley prepared, 314 00:42:49.800 --> 00:43:05.460 Zvonimir Bandic: and we had four RISC-V harts running on each of the boards. A hart is RISC-V lingo for a hardware thread, so basically we have four cores, 315 00:43:06.450 --> 00:43:21.660 Zvonimir Bandic: and there they are, 1, 2, 3, and 4. They're all 64-bit Rocket cores, U54. And then we have a second node with cores 9, 10, 11, and 12 running on another node. 316 00:43:22.560 --> 00:43:34.980 Zvonimir Bandic: And when we actually go into the CPU info, we see eight cores: you know, four cores local to one node, and four cores 317 00:43:35.820 --> 00:43:50.340 Zvonimir Bandic: from the other node, connected through the switch. So this is kind of very cool. It was sort of a first demonstration of an open source symmetric multiprocessing protocol,
318 00:43:51.150 --> 00:44:02.220 Zvonimir Bandic: or of cache coherence, and this is something that Dejan and I have been seeking, you know, for many years, and it's becoming a reality with OmniXtend. 319 00:44:05.970 --> 00:44:07.590 Zvonimir Bandic: And I think that's the, that's 320 00:44:08.970 --> 00:44:09.870 Zvonimir Bandic: the last slide. 321 00:44:12.990 --> 00:44:13.620 Zvonimir Bandic: I tried to 322 00:44:13.800 --> 00:44:14.430 Ted Marena: Before... 323 00:44:14.670 --> 00:44:17.040 Zvonimir Bandic: I was going into the chat window and trying to answer. 324 00:44:17.130 --> 00:44:18.330 Ted Marena: I see. Yeah, I was kind of... 325 00:44:19.140 --> 00:44:22.380 Dejan: I'm typing up an answer to one of the questions, on how much these switches cost. 326 00:44:22.950 --> 00:44:32.040 Dejan: So I can just say the numbers are around $9,000 for 32 ports, with 64-port versions costing more, and the programmable switches are priced about the same as fixed-function ones. 327 00:44:32.550 --> 00:44:41.700 Dejan: So I don't know what "expensive" means to the person who asked the question, but these are not outrageous prices; this is all fairly reasonable for data center equipment. 328 00:44:42.510 --> 00:44:56.970 Ted Marena: I think the other point, though, before we hand it off to Curt: you know, we should also just let people know the status. Right? Like, you may want to just reiterate and summarize for folks. 329 00:45:00.600 --> 00:45:10.260 Zvonimir Bandic: Yeah, I just want to add that cache-coherent switches built as custom devices cost a tremendous amount, millions of dollars. 330 00:45:10.770 --> 00:45:26.400 Zvonimir Bandic: Ethernet switches are not cheap; they're not something you'd typically put in your house, but they're normal IT equipment, fairly normal for any university, academic group, company, etc. So they're not prohibitively expensive.
331 00:45:29.280 --> 00:45:41.610 Ted Marena: Okay. Um, I think the questions have been answered, but obviously folks may have some other, you know, feedback or questions. I also typed a couple of 332 00:45:42.210 --> 00:45:56.700 Ted Marena: web links in the Q&A. So if people want more information on OmniXtend, there's a link to the GitHub with a lot of the details, the spec and so on, that Dejan had mentioned, and then 333 00:45:57.270 --> 00:46:03.780 Ted Marena: also, just for videos and sort of more instructional things, there's a Western Digital link. 334 00:46:05.220 --> 00:46:07.410 Ted Marena: So let's turn it over to Curt. 335 00:46:09.270 --> 00:46:09.930 Curt Beckmann, Intel: All right. 336 00:46:10.860 --> 00:46:14.940 Zvonimir Bandic: Do I have to do something special, or can you just grab it? I'll stop sharing. 337 00:46:15.060 --> 00:46:19.560 Curt Beckmann, Intel: Let's see what we've got here. Let's try this. 338 00:46:23.520 --> 00:46:25.770 Curt Beckmann, Intel: Can you see my screen? Yes? 339 00:46:25.830 --> 00:46:29.640 Curt Beckmann, Intel: Yes. All right. Now I'll go put it in presentation mode and everything 340 00:46:30.090 --> 00:46:30.930 Curt Beckmann, Intel: explodes. 341 00:46:33.900 --> 00:46:36.450 Curt Beckmann, Intel: You can still see it, right? 342 00:46:37.230 --> 00:46:38.580 Ted Marena: Yes, it's still okay. 343 00:46:38.700 --> 00:46:39.480 Curt Beckmann, Intel: Great. 344 00:46:39.660 --> 00:46:40.050 Alright. 345 00:46:42.120 --> 00:46:49.530 Curt Beckmann, Intel: So we'll dive right in. Actually, I don't have much to say, because it seems like the presentation was given earlier. I'm only joking, but 346 00:46:50.610 --> 00:46:56.190 Curt Beckmann, Intel: I was very pleased to see that discussion of how Tofino 347 00:46:57.660 --> 00:47:06.870 Curt Beckmann, Intel: was able to be adapted to all these things.
I kind of knew that before, but it was even adapted to multiple versions. So this picture here is meant to show how classic 348 00:47:07.500 --> 00:47:22.470 Curt Beckmann, Intel: network elements work. The network elements are essentially fixed-function; that's the classic fixed-function ASIC, and you don't get to decide how you want things to work: the ASIC basically tells you how that's going to work. 349 00:47:23.520 --> 00:47:24.690 Curt Beckmann, Intel: And it's... 350 00:47:28.830 --> 00:47:29.670 Curt Beckmann, Intel: You can't 351 00:47:30.870 --> 00:47:35.910 Curt Beckmann, Intel: push the network requirements down into the ASIC. Moving on to 352 00:47:36.990 --> 00:47:45.450 Curt Beckmann, Intel: the goal in the Tofino case, the programmable network case. And I should say P4 was invented about ten minutes before, you know, that: 353 00:47:46.080 --> 00:47:54.720 Curt Beckmann, Intel: there was this pent-up demand for something better, and as soon as P4 arrived it seemed like there was a surge, and Barefoot was founded back in 354 00:47:55.260 --> 00:48:06.090 Curt Beckmann, Intel: 2013 or 2012. I'm at Barefoot now, but I wasn't there that early; I was in the neighborhood when it was founded, and I remember all the energy. So 355 00:48:07.080 --> 00:48:16.320 Curt Beckmann, Intel: the point here is that we get to figure out what we want the network to do, and that will, in turn, allow us to specify in our programmable chip 356 00:48:17.370 --> 00:48:21.780 Curt Beckmann, Intel: the behavior that we want on that chip. Now, that Tofino chip, as I build this out...
357 00:48:23.310 --> 00:48:40.650 Curt Beckmann, Intel: It's barely Ethernet-aware. As was actually described a little bit ago, Ethernet is the sort of least-constrained, widely available protocol; we get all kinds of great SerDes and other devices, and there are lots of people out there competing to provide those to you. 358 00:48:41.820 --> 00:48:50.040 Curt Beckmann, Intel: In the Tofino chip itself there are match-action cores, and I'll show you that in a minute. But essentially, we define what we want, and we push down into the Tofino 359 00:48:50.520 --> 00:49:09.420 Curt Beckmann, Intel: the behavior that we want. And other than, you know, CRCs at the ends of packets, and it being hard to go too far below 64 bytes, it's not very Ethernet-aware; despite having Ethernet MACs, we do not use too much of the Ethernet specifics. 360 00:49:10.860 --> 00:49:19.230 Curt Beckmann, Intel: But there is sort of a limit at some level, because when you get down into it, there's even physical-layer stuff that's effectively Ethernet-specified now. 361 00:49:21.480 --> 00:49:22.290 Curt Beckmann, Intel: All right, so 362 00:49:23.310 --> 00:49:30.330 Curt Beckmann, Intel: how do we do this? Here's the complicated version. Right. You take your P4, you stick it through a compiler, and it generates a 363 00:49:31.710 --> 00:49:32.100 Curt Beckmann, Intel: a 364 00:49:34.800 --> 00:49:36.360 Curt Beckmann, Intel: binary that is loaded into... 365 00:49:36.420 --> 00:49:42.690 Ted Marena: Curt, real quick: there was just a question, what does P4 stand for? I know it's a programming language; maybe you can just quickly answer. 366 00:49:42.750 --> 00:49:59.850 Curt Beckmann, Intel: Oh yeah, sure. That seems like a good place to start. It stands for Programming Protocol-independent Packet Processors.
It comes from the four P's, and I tend to rearrange them sometimes, but 367 00:50:01.410 --> 00:50:06.300 Curt Beckmann, Intel: it is a language that was conceived quite some time ago, and it... 368 00:50:08.370 --> 00:50:16.470 Curt Beckmann, Intel: The idea is... this is the fixed-function version, the sort of classic Ethernet switch that I designed, you know, years ago, 369 00:50:17.250 --> 00:50:24.540 Curt Beckmann, Intel: long before Barefoot. You have a fixed parser; it does fixed lookups in a fixed sequence. So first we look up 370 00:50:25.290 --> 00:50:32.520 Curt Beckmann, Intel: sort of MAC addresses, and we decide if it's valid Ethernet. And then we look up IP addresses, and maybe we look up TCP ports, 371 00:50:32.850 --> 00:50:39.720 Curt Beckmann, Intel: we look up ACLs, and we do various things. Maybe we did tunneling, maybe we didn't, but all that stuff was basically hardwired, 372 00:50:40.080 --> 00:50:46.680 Curt Beckmann, Intel: and you wouldn't have been able to... well, I don't know how you could take that and do OmniXtend on a chip like that. 373 00:50:47.490 --> 00:50:54.750 Curt Beckmann, Intel: On the programmable switch, though, we have the ability: it's a flexible parser; you actually define how you want to parse things in P4. 374 00:50:55.530 --> 00:51:03.120 Curt Beckmann, Intel: You compile that and configure the switch. Well, you don't get to see the details, and you don't want to; the compiler 375 00:51:03.810 --> 00:51:13.320 Curt Beckmann, Intel: does the mappings onto these shared memories and flexible lookups, and the actions are also pretty flexible. I'll show some other creative things that people have done with 376 00:51:14.040 --> 00:51:23.370 Curt Beckmann, Intel: Tofino in a little bit. And there's a variable number of stages.
I mean, the chip has limits, 12 for the first chip and 20 for the second chip, which is 377 00:51:24.420 --> 00:51:27.150 Curt Beckmann, Intel: just coming online and will be in production later this year. 378 00:51:30.840 --> 00:51:42.720 Curt Beckmann, Intel: But the program may not use all those stages, which can reduce your latency, and there are various ways it can reassign memories and so on. So all that's very programmable, how fields are assigned and so on. 379 00:51:43.800 --> 00:52:03.360 Curt Beckmann, Intel: We go through this pass on ingress, and then it essentially buffers up the result, queues it up for transmission, and there's another egress-side processing opportunity. It's not required, so in some cases you can bypass the egress for lower latency if there is no 380 00:52:04.530 --> 00:52:07.140 Curt Beckmann, Intel: manipulation of the packet that's required. 381 00:52:09.960 --> 00:52:19.980 Curt Beckmann, Intel: So this leaves you with a customer-definable switch. And when we say customer, from our perspective, firms like Cisco and Arista are our customers. 382 00:52:20.910 --> 00:52:36.870 Curt Beckmann, Intel: Whether they decide to share that on to the end customer is up to them. Sometimes both are possible: they do their own definition, and then, you know, if you paid a licensing fee for the tools, then the end user, the operator, the network operator, can also 383 00:52:38.340 --> 00:52:50.130 Curt Beckmann, Intel: make changes. This is hugely valuable independent of OmniXtend. But obviously OmniXtend takes it kind of to a new level, because it's really doing creative things down here, 384 00:52:51.660 --> 00:52:56.010 Curt Beckmann, Intel: where you can define, in relatively few numbers of tables, 385 00:52:57.630 --> 00:53:13.050 Curt Beckmann, Intel: how to do simple forwarding not based on traditional addressing or routing schemes, but you also have the ability to extend that.
And as mentioned, you could even use some of the memory internal to the Tofino as shared memory for 386 00:53:14.310 --> 00:53:18.870 Curt Beckmann, Intel: you know, other devices that are connected to your switch or to your larger network. 387 00:53:22.380 --> 00:53:29.550 Curt Beckmann, Intel: I mentioned... let's see, what's another good thing to say about P4 here. There are a number of extensions. There are ways that we can 388 00:53:30.690 --> 00:53:34.920 Curt Beckmann, Intel: create what are called externs to do things that are kind of beyond what P4 does, 389 00:53:35.670 --> 00:53:51.030 Curt Beckmann, Intel: things like Bloom filters and other kinds of creative things. I don't know how applicable those might be to OmniXtend, but I want to give you the idea that you could do something that could potentially combine OmniXtend with other features in your environment. 390 00:53:52.080 --> 00:53:55.590 Curt Beckmann, Intel: Here's the simple picture again. So it goes through the programmable parser. 391 00:53:57.210 --> 00:54:04.770 Curt Beckmann, Intel: It parses, or breaks things up into these sort of header fields, which are then looked up in the tables, and then 392 00:54:05.520 --> 00:54:12.540 Curt Beckmann, Intel: manipulations are done through this match-action table, and it's passed on to the next stage. And we've shown here that the 393 00:54:13.050 --> 00:54:22.680 Curt Beckmann, Intel: header fields have changed between stage one and stage two because of that manipulation, and so on and so on it goes through. There are only four stages shown here, but 394 00:54:23.790 --> 00:54:33.360 Curt Beckmann, Intel: you know, there are 12 in the first gen and 20 in the second gen, and this is just in ingress; it's in egress as well. 395 00:54:34.410 --> 00:54:41.820 Curt Beckmann, Intel: And then they're put into a queue and sent, or sent out if it's on the egress side.
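[Editor's note: to make the parse-then-match-action flow described above concrete, here is a toy sketch in plain Python. It is an illustration only; the field names, tables, and two-stage layout are invented for the example and are not Tofino's actual architecture or the P4 language.]

```python
# Toy model of a match-action pipeline: a parser carves a packet into
# header fields, then each stage looks the fields up in a table and the
# first matching entry's action may rewrite the fields before the next
# stage, as in the slide. All names here are invented for illustration.

def parse(packet: bytes) -> dict:
    """Fixed toy parser: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType."""
    return {
        "dst_mac": packet[0:6].hex(),
        "src_mac": packet[6:12].hex(),
        "eth_type": int.from_bytes(packet[12:14], "big"),
    }

def run_pipeline(headers: dict, stages: list) -> dict:
    """Each stage is a list of (match_fn, action_fn) entries."""
    for table in stages:
        for match, action in table:
            if match(headers):
                headers = action(headers)
                break  # one match-action per stage
    return headers

# Stage 1: forward by destination MAC; stage 2: mark IPv4 by EtherType.
stages = [
    [(lambda h: h["dst_mac"] == "ffffffffffff",
      lambda h: {**h, "egress_port": "flood"}),
     (lambda h: True,
      lambda h: {**h, "egress_port": 1})],
    [(lambda h: h["eth_type"] == 0x0800,
      lambda h: {**h, "is_ip": True})],
]

pkt = bytes.fromhex("ffffffffffff" "0000deadbeef") + (0x0800).to_bytes(2, "big")
out = run_pipeline(parse(pkt), stages)
print(out["egress_port"], out["is_ip"])  # flood True
```

A real Tofino program expresses the parser and tables in P4 and the compiler maps them onto the hardware stages; this sketch only mirrors the dataflow.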
396 00:54:45.390 --> 00:54:57.030 Curt Beckmann, Intel: So the programmable parser, as we talked about already: it carves up those headers. Typically, you know, most of our users really are doing Ethernet, and typically IP on top of Ethernet. 397 00:54:58.200 --> 00:55:05.520 Curt Beckmann, Intel: But sometimes, once you get past IP, then they start doing more creative things. I mean, you could do MPLS, which I suppose is actually 398 00:55:06.150 --> 00:55:15.180 Curt Beckmann, Intel: in the layer between the Ethernet and IP, for example. So all these match-action fields are quite generic; there's nothing particularly special about them. 399 00:55:16.080 --> 00:55:26.850 Curt Beckmann, Intel: And you can combine them, make very wide keys, and the memories in here allow for exact matching, longest-prefix matching, and 400 00:55:28.170 --> 00:55:39.420 Curt Beckmann, Intel: hashing as well. So for large keys that's obviously useful because, you know, you can't do a direct lookup with a, you know, 64-bit key or 128-bit key. 401 00:55:41.610 --> 00:55:49.200 Curt Beckmann, Intel: Let's see, we talked about exact match; TCAM ternary match for those wide keys where we want to 402 00:55:50.940 --> 00:55:54.660 Curt Beckmann, Intel: don't-care some bits. 12 to 20 hardware stages. 403 00:56:00.840 --> 00:56:13.620 Curt Beckmann, Intel: Sorry. So, as we showed, in Tofino there's other, like, internal information that we can pass around. So for example, the queuing
state in the traffic manager is available and can be captured and shared. 404 00:56:14.280 --> 00:56:26.520 Curt Beckmann, Intel: We can respond to certain events like packet arrivals, but also time, although we heard earlier that 405 00:56:28.020 --> 00:56:30.000 Curt Beckmann, Intel: timeouts are not appropriate for, 406 00:56:32.520 --> 00:56:39.060 Curt Beckmann, Intel: well, at least TileLink; maybe for OmniXtend. I'm still learning the differences between, you know, what works and what doesn't. 407 00:56:39.690 --> 00:56:57.540 Curt Beckmann, Intel: But we can generate telemetry, essentially, in a variety of cases. We can do mirroring and so on, which is useful for obviously monitoring what's going on in your network, and potentially that would be relevant for an OmniXtend environment, where we would send messages over 408 00:56:58.830 --> 00:57:08.490 Curt Beckmann, Intel: maybe a traditional Ethernet link to some other device, based on what's going on inside the OmniXtend sort of universe. 409 00:57:09.600 --> 00:57:12.270 Curt Beckmann, Intel: Those telemetry messages are 410 00:57:13.350 --> 00:57:19.830 Curt Beckmann, Intel: fully in the data path, so the control plane does not need to be involved, which is good, because you can tend to overwhelm your control plane. 411 00:57:20.160 --> 00:57:28.530 Curt Beckmann, Intel: It's not good for the control plane and it's not good for the telemetry. And so they can get sent off to something that's got more horsepower and can gather all that up 412 00:57:29.580 --> 00:57:31.080 Curt Beckmann, Intel: and then 413 00:57:33.570 --> 00:57:36.480 Curt Beckmann, Intel: basically draw the insights from the data as it arrives. 414 00:57:38.040 --> 00:57:52.200 Curt Beckmann, Intel: Here are some examples of creative applications that people have done with Tofino. So SwitchML: that's machine learning done in the switch... well, leveraging the capabilities of the switch.
If you're into machine learning, 415 00:57:53.220 --> 00:57:58.680 Curt Beckmann, Intel: training especially: often the training data is so huge, you may be doing parallel 416 00:58:00.030 --> 00:58:17.010 Curt Beckmann, Intel: parameter regression on multiple servers, and they need to make their contributions available to each other. In the classic model that's an n-squared kind of a problem, but by moving the parameter aggregation of these updates into the network, effectively into the Tofino, 417 00:58:18.180 --> 00:58:28.020 Curt Beckmann, Intel: that n-squared problem turns into an order-n problem. And there are actually, like, several-percent improvements. There's a secondary benefit as well: 418 00:58:28.590 --> 00:58:43.110 Curt Beckmann, Intel: when something's n squared... and it's interesting, it seemed like there was a parallel to the OmniXtend discussion earlier, that things scale a lot better when things are order n than order n squared. So typically the cluster size for SwitchML, for 419 00:58:44.160 --> 00:58:44.670 Curt Beckmann, Intel: large 420 00:58:45.750 --> 00:58:55.830 Curt Beckmann, Intel: training clusters, often maxes out around 16, so you tend to make each of the servers as large as you can. Sometimes they get into 32 or 64. But when you have an order-n 421 00:58:58.830 --> 00:59:08.010 Curt Beckmann, Intel: behavior, then you can scale that out to a much larger number without as much pain. And as we've seen, the databases, the data structures, the 422 00:59:08.550 --> 00:59:16.200 Curt Beckmann, Intel: data lakes that people are crunching on are getting huge, and they want to use as big a cluster as they can to get things to iterate quickly. 423 00:59:17.010 --> 00:59:29.370 Curt Beckmann, Intel: So that's a quite creative use case. Advanced congestion control: it does seem like RDMA was a part of this discussion as well. One of our end customers actually
did some creative work. 424 00:59:30.630 --> 00:59:33.480 Curt Beckmann, Intel: I mentioned the telemetry, which is 425 00:59:34.650 --> 00:59:46.350 Curt Beckmann, Intel: INT. Well, they did a variation on telemetry. Telemetry usually goes to some, you know, external observer; in this case they were using some of the telemetry capabilities to actually 426 00:59:47.520 --> 01:00:02.340 Curt Beckmann, Intel: enhance the protocol, improve the performance of RoCE, by either sending ahead to the receiver information that would otherwise come later due to prioritization, or sending back to the sender 427 01:00:03.510 --> 01:00:11.940 Curt Beckmann, Intel: sort of an early response. There are different variations of this that are being experimented with, particularly around the RoCE protocol. Very, you know, very 428 01:00:15.450 --> 01:00:16.410 Curt Beckmann, Intel: much improved 429 01:00:17.850 --> 01:00:25.380 Curt Beckmann, Intel: latency behavior, particularly on the endpoints. And as a result, you're much more sensitive to the latency in the network, which is 430 01:00:26.460 --> 01:00:29.760 Curt Beckmann, Intel: certainly very competitive on the Tofino. But the ability 431 01:00:31.380 --> 01:00:38.610 Curt Beckmann, Intel: to generate these advanced congestion control messages essentially programmatically, without, you know, without respinning silicon, 432 01:00:39.420 --> 01:00:47.880 Curt Beckmann, Intel: that's pretty much a Tofino capability. If you want to look that up, that's HPCC, the High Precision Congestion Control paper. 433 01:00:48.450 --> 01:00:55.050 Curt Beckmann, Intel: You can look up SwitchML as well. And then one of the other interesting ones is the telemetry I've already mentioned before. That's another 434 01:00:55.890 --> 01:01:08.550 Curt Beckmann, Intel: protocol. It's been developed... P4.org developed it early, and there's an IOAM group now in the IETF that's working on that. The point I'm trying to make there is that telemetry is
something there's lots of interest in, 435 01:01:09.600 --> 01:01:24.600 Curt Beckmann, Intel: and it's slightly evolving, and having a programmable device is very useful. In this case... we haven't talked about, to my knowledge, in that telemetry group, they're not talking yet about OmniXtend, and maybe they should. And having a very programmable device 436 01:01:26.190 --> 01:01:30.540 Curt Beckmann, Intel: lends itself to that, so that we could do telemetry, you know, 437 01:01:31.680 --> 01:01:40.830 Curt Beckmann, Intel: take OmniXtend and then send a telemetry message out some different port over an Ethernet IP network, as I mentioned before. 438 01:01:43.200 --> 01:01:44.100 Curt Beckmann, Intel: So that was 439 01:01:45.270 --> 01:01:50.460 Curt Beckmann, Intel: P4 and Tofino in a nutshell. Are there any other questions, or 440 01:01:51.480 --> 01:01:52.530 Curt Beckmann, Intel: any first questions? 441 01:01:55.560 --> 01:01:57.030 Curt Beckmann, Intel: Things I should have covered and didn't? 442 01:01:58.950 --> 01:02:00.630 Jeffrey Osier-Mixon: There are some in the chat. 443 01:02:01.080 --> 01:02:03.210 Ted Marena: Yeah, let's see, there's a question in the chat. 444 01:02:05.220 --> 01:02:07.290 Ted Marena: So I'm not sure. Can you see that? 445 01:02:08.700 --> 01:02:09.300 Curt Beckmann, Intel: In a moment. 446 01:02:11.610 --> 01:02:16.680 Ted Marena: Let me, let me ask you... let me, I'll just ask it. Has anybody gotten P4 to scale up to 447 01:02:16.680 --> 01:02:23.490 Ted Marena: 100 or 800G for classification on FPGAs without eating up the entire FPGA real estate? 448 01:02:24.090 --> 01:02:27.540 Curt Beckmann, Intel: Uh huh. Well, on the FPGA side, I could 449 01:02:27.690 --> 01:02:31.590 Curt Beckmann, Intel: probably get back to you on that by talking to the FPGA group at Intel. 450 01:02:33.240 --> 01:02:35.400 Curt Beckmann, Intel: Hundred gig? I think so. I know...
451 01:02:38.400 --> 01:02:47.280 Curt Beckmann, Intel: I know that they've done it at lower speeds, but that was some time ago. So I would think that they're able to do 100G on an FPGA with P4. 452 01:02:52.260 --> 01:02:52.650 Ted Marena: Okay. 453 01:02:54.930 --> 01:03:02.400 Ted Marena: So we have a few more minutes here. You know, if people want to ask questions, you can certainly ask in the Q&A or in the chat. 454 01:03:03.450 --> 01:03:13.050 Ted Marena: Zvonimir, did you have any sort of, like, next steps or closing, or did you just want to verbalize something? I wasn't sure if you had sort of closing slides. 455 01:03:13.350 --> 01:03:14.250 Zvonimir Bandic: I don't have a 456 01:03:14.910 --> 01:03:18.990 Zvonimir Bandic: specific closing slide, but I would like to... 457 01:03:20.190 --> 01:03:21.930 Zvonimir Bandic: actually, I might, I may even... 458 01:03:23.610 --> 01:03:25.890 Ted Marena: Zvonimir, before you go on, 459 01:03:26.190 --> 01:03:29.220 Ted Marena: there's a question. Can you see the question in the Q&A? 460 01:03:29.700 --> 01:03:35.070 Ted Marena: It was: was the P4 firmware used or modified in the lab by Dejan and Zvonimir? 461 01:03:36.750 --> 01:03:38.160 Dejan: So I can probably take that one. 462 01:03:38.400 --> 01:03:38.970 Yeah. 463 01:03:40.290 --> 01:03:46.470 Dejan: Just to be clear, it was used or modified by neither Dejan nor I; this was some very bright people who 464 01:03:47.610 --> 01:03:54.900 Dejan: work in my group. But yeah, we had not actually modified the switch, right. So the switch itself does not come 465 01:03:55.500 --> 01:04:05.070 Dejan: with any kind of programming, right? It's not protocol specific. Actually, it will receive Ethernet frame packets, but it doesn't contain, you know, TCP/IP or any of that 466 01:04:05.850 --> 01:04:12.870 Dejan: firmware on it. So all we did was program the switch to accept a specific packet format.
It's a completely different format than Ethernet. 467 01:04:13.530 --> 01:04:18.510 Dejan: So all the changes that we did were, you know, in basically programming the chip itself. 468 01:04:18.960 --> 01:04:29.640 Dejan: And then there's a different project, which Zvonimir showed. So we took the Tofino chip and just made a board; the motherboard there, instead of having front-plate QSFP ports, has different kinds of connection 469 01:04:31.080 --> 01:04:42.630 Dejan: formats, like this new... it's going to be PCIe Gen 5, and exposed those on the board, just as a test for different kinds of short-reach connections. Does that answer your question? 470 01:04:44.820 --> 01:04:45.120 Dejan: He says... 471 01:04:45.570 --> 01:04:46.110 Ted Marena: I think 472 01:04:46.440 --> 01:04:49.080 Zvonimir Bandic: the person who asked is muted, so 473 01:04:49.500 --> 01:04:55.020 Dejan: yeah, so there's another person who wants to ask the question live. I'm not sure how to unmute. 474 01:04:55.320 --> 01:04:57.540 Ted Marena: I know. 475 01:04:58.290 --> 01:05:06.510 Ted Marena: And I think the other thing I would add is, I believe that the P4 code is available on GitHub. 476 01:05:07.260 --> 01:05:12.330 Dejan: Correct. Well, the code is very minimal at this point, because we only have the forwarding code and the control plane setup, right. 477 01:05:13.110 --> 01:05:17.190 Dejan: Really, the most exciting thing would be to have the code that actually 478 01:05:17.880 --> 01:05:23.670 Dejan: you know, stops these packets, the coherence messages, and then acts as a directory. So the switch would act as a directory for coherence. 479 01:05:24.120 --> 01:05:33.660 Dejan: Because this would allow you to scale out way beyond, you know, the four sockets or eight sockets that you can do today. So with one switch you can scale to 256 lanes, if you do one lane per 480 01:05:34.560 --> 01:05:45.960 Dejan: But we want...
We have much higher ambitions, right; we have ambitions with tens of thousands of nodes. So for this we would need actually much more P4 programming to make this practical, hierarchical, as mentioned, so we can scale better. 481 01:05:48.720 --> 01:05:49.020 Ted Marena: Okay. 482 01:05:49.980 --> 01:05:50.220 Jeffrey Osier-Mixon: There was 483 01:05:50.370 --> 01:05:50.700 Curt Beckmann, Intel: Just 484 01:05:50.790 --> 01:05:56.730 Curt Beckmann, Intel: in the chat... I also want to clarify. The point was made that when you get the switch, it's, it's... 485 01:05:58.140 --> 01:06:05.040 Curt Beckmann, Intel: it doesn't have any of the Ethernet IP kind of stuff. It depends where you buy your switch. If you buy your switch as a white box, then, as mentioned, it's 486 01:06:06.930 --> 01:06:20.310 Curt Beckmann, Intel: a raw machine. And if you buy it from, say, Arista or Cisco or something, they set up a bunch of P4 in there for you. So you kind of have your choice. But yeah, the good news is, if you want 487 01:06:20.790 --> 01:06:26.880 Curt Beckmann, Intel: your own sandbox, it comes as a sandbox, if you get it from the right source. 488 01:06:28.110 --> 01:06:37.650 Ted Marena: Yeah, there's another question. So, let's see: Lee had asked, how does the switch OS compare with software-defined networking? 489 01:06:39.690 --> 01:06:41.880 Curt Beckmann, Intel: So, talking about it from the switch side: 490 01:06:43.320 --> 01:06:55.020 Curt Beckmann, Intel: if you buy, as I mentioned, if you buy from Cisco or Arista, I think that sometimes they'll say that they let you run SONiC.
And my guess is that's a negotiation. Typically SONiC is... SONiC is... 491 01:06:55.440 --> 01:07:01.320 Curt Beckmann, Intel: I forget what the acronym is now; someone was asking me about it before. But it's one of the open networking 492 01:07:02.340 --> 01:07:03.510 Curt Beckmann, Intel: platforms that is 493 01:07:04.860 --> 01:07:19.290 Curt Beckmann, Intel: very SDN friendly. And then there's also Stratum, which is even more SDN friendly. Versions of both of those are available that run on white-box Tofino switches. Now then, of course, the 494 01:07:20.850 --> 01:07:28.230 Curt Beckmann, Intel: SDN controller is kind of your assignment, or, you know, that's outside of the box. But yeah, Tofino was... 495 01:07:29.490 --> 01:07:38.160 Curt Beckmann, Intel: Barefoot and Tofino were essentially, you know, born in the SDN heyday, the really SDN heyday. So they're very SDN friendly. 496 01:07:43.680 --> 01:07:56.790 Ted Marena: Also, just to let everybody know, I put in the chat box, one more time, the link for all these presentations. So let me sort of throw it over to Zvonimir; I think you had something you were going to summarize. 497 01:07:57.090 --> 01:07:59.820 Zvonimir Bandic: I would like... I don't know if I can share it again. 498 01:08:01.140 --> 01:08:01.950 Zvonimir Bandic: Alright, I can't... 499 01:08:02.130 --> 01:08:02.940 Ted Marena: You should be able to. 500 01:08:03.990 --> 01:08:04.950 Dejan: Alright, so 501 01:08:05.160 --> 01:08:06.000 Zvonimir Bandic: I want to 502 01:08:08.760 --> 01:08:10.560 Zvonimir Bandic: show this slide. 503 01:08:12.330 --> 01:08:15.750 Zvonimir Bandic: So these are the active... these are the currently... 504 01:08:16.800 --> 01:08:17.640 Difficult 505 01:08:21.660 --> 01:08:23.220 Zvonimir Bandic: Zoom problems. 506 01:08:26.550 --> 01:08:29.970 Zvonimir Bandic: So these are the current, current work groups
507 01:08:31.080 --> 01:08:36.180 Zvonimir Bandic: active in CHIPS Alliance, and the activities that we discussed today, which is TileLink, 508 01:08:37.950 --> 01:08:51.390 Zvonimir Bandic: the TileLink protocol serialization; OmniXtend, which is TileLink over Ethernet; and also the AIB specification for chiplets, all discussed in the interconnect work group. So I want to encourage 509 01:08:52.740 --> 01:08:58.740 Zvonimir Bandic: individuals who are interested to visit the CHIPS Alliance GitHub, and... 510 01:08:59.970 --> 01:09:13.800 Zvonimir Bandic: all these activities are open source activities, so we encourage people to participate. And our GitHub ID is simple, just chipsalliance. And in addition to that, 511 01:09:16.740 --> 01:09:24.900 Zvonimir Bandic: companies can join as members of CHIPS Alliance and reach many more resources and collaboration, as well as face to face... 512 01:09:26.400 --> 01:09:34.710 Zvonimir Bandic: well, not now, not all of our face to face, but certainly online meetings every two weeks for the development of specifications. 513 01:09:37.800 --> 01:09:46.470 Ted Marena: Great. Alright, so, Wesley or Curt, did you guys have anything else you wanted to sort of summarize or chime in here? 514 01:09:46.860 --> 01:09:56.310 Curt Beckmann, Intel: A good mention of the open source. I don't have a slide related to it, but P4: there's a lot of P4 out on GitHub, and there's a lot of openness in that community as well. 515 01:09:57.000 --> 01:10:05.850 Curt Beckmann, Intel: Just recently, the acquisition of Barefoot by Intel has sort of supported additional openness. 516 01:10:06.750 --> 01:10:11.280 Curt Beckmann, Intel: I think there were some things that Barefoot... so I joined Barefoot just after the Intel acquisition, 517 01:10:12.120 --> 01:10:20.160 Curt Beckmann, Intel: and what I know about Barefoot before then is what I've heard from others, that they kept some stuff a little bit closed,
518 01:10:20.940 --> 01:10:34.710 Curt Beckmann, Intel: what they felt, for their own protection. They don't feel that sensitive about that now, and so they're opening up even more. And there is a big open source community around P4 running on Tofino. 519 01:10:38.940 --> 01:10:39.420 Great. 520 01:10:41.220 --> 01:10:53.370 Ted Marena: Okay. So, Wesley, or... let's see. Maybe we'll just hang here for another minute. Let's see... there is another question: speaking of development in the open, why are there just a couple of commits in the TileLink repo? 521 01:10:55.350 --> 01:10:55.560 Ted Marena: This... 522 01:10:55.710 --> 01:10:56.250 Zvonimir Bandic: I can... 523 01:10:57.090 --> 01:10:57.930 Zvonimir Bandic: I can answer that. 524 01:10:58.140 --> 01:10:58.620 Ted Marena: Go ahead. 525 01:10:59.010 --> 01:10:59.580 Wesley Terpstra: I didn't even know. 526 01:11:02.130 --> 01:11:02.790 Zvonimir Bandic: It has 527 01:11:03.870 --> 01:11:06.120 Zvonimir Bandic: just recently been moved from 528 01:11:07.530 --> 01:11:07.950 Zvonimir Bandic: from 529 01:11:09.270 --> 01:11:14.070 Zvonimir Bandic: from the previous repo, so it's relatively new to the CHIPS Alliance. 530 01:11:16.140 --> 01:11:20.790 Zvonimir Bandic: But it's been around for, you know, a super long time, obviously more than five years. 531 01:11:23.640 --> 01:11:25.440 Ted Marena: And there's also a question: 532 01:11:26.460 --> 01:11:38.670 Ted Marena: TileLink looks nice, but is only used for RISC-V to build an SoC. How shall we use the interconnect to work with other existing system IPs that were provided by other vendors, in the Arm camp, for example? 533 01:11:39.180 --> 01:11:46.410 Wesley Terpstra: Oh, this one's really easy. There are open source implementations of TileLink converters to APB, AHB, AXI. 534 01:11:46.950 --> 01:11:56.910 Wesley Terpstra: So if you want to do this, you can just wrap up those IPs with a converter and then attach it to the TileLink network.
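[Editor's note: conceptually, such a converter translates one bus protocol's transactions into the other's. The sketch below models a single TileLink Get read being bridged to an AXI read, in plain Python as a teaching aid; the dictionary field names are simplified, and the real open source converters are hardware RTL, not software.]

```python
# Toy software model of what a TileLink-to-AXI read bridge does:
# a TileLink A-channel Get becomes an AXI read-address (AR) beat, and
# the AXI read-data (R) beat comes back as a TileLink D-channel
# AccessAckData. Field names are simplified for illustration.

def tl_get_to_axi_ar(a: dict) -> dict:
    """Map a TileLink Get (A-channel opcode 4) onto an AXI AR beat."""
    assert a["opcode"] == 4  # Get
    return {
        "arid": a["source"],   # TL source id -> AXI transaction id
        "araddr": a["address"],
        "arlen": 0,            # single beat, for simplicity
        "arsize": a["size"],   # log2(bytes), same encoding in both
    }

def axi_r_to_tl_d(r: dict, size: int) -> dict:
    """Map an AXI R beat back onto a TileLink AccessAckData (opcode 1)."""
    return {
        "opcode": 1,                # AccessAckData
        "source": r["rid"],
        "size": size,
        "data": r["rdata"],
        "denied": r["rresp"] != 0,  # AXI error response -> TL denied
    }

# Example: a Get for 8 bytes at 0x1000 round-trips through a fake AXI slave.
get = {"opcode": 4, "source": 3, "address": 0x1000, "size": 3}
ar = tl_get_to_axi_ar(get)
r = {"rid": ar["arid"], "rdata": 0xDEADBEEF, "rresp": 0}  # fake slave reply
d = axi_r_to_tl_d(r, get["size"])
print(d["opcode"], d["source"], d["denied"])  # 1 3 False
```

The real converters also handle bursts, write channels, and flow control; this only shows the request/response mapping idea behind wrapping an AXI IP for a TileLink network.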
We did that with all sorts of IP that we acquired when we built an actual chip. 535 01:11:58.860 --> 01:12:14.130 Ted Marena: And also, OmniXtend is an open specification. So if someone wants to put it on a processor other than a RISC-V processor, then that's also possible. So it's an open standard; that's one of the key differentiators. 536 01:12:14.490 --> 01:12:20.910 Zvonimir Bandic: Right, and I know several startup companies that are using TileLink in non-RISC-V chips. 537 01:12:22.710 --> 01:12:22.860 Oh, 538 01:12:24.180 --> 01:12:27.900 Zvonimir Bandic: it's already... it's already happening. I don't know if they use the 539 01:12:28.950 --> 01:12:35.700 Zvonimir Bandic: open source bus adapters that Wesley mentioned or some other ones, but it's being used. 540 01:12:49.260 --> 01:12:50.040 Zvonimir Bandic: They're still there. 541 01:12:50.640 --> 01:12:54.150 Ted Marena: I'm still here. Yeah, I just figured I'd give everybody another second before 542 01:12:55.020 --> 01:12:56.610 Ted Marena: we officially sign off. 543 01:12:58.290 --> 01:12:59.220 Ted Marena: This is for me. 544 01:12:59.460 --> 01:13:00.270 Zvonimir Bandic: It's for you. 545 01:13:00.360 --> 01:13:04.770 Ted Marena: Well, the last thing you want is to clone me, I can absolutely assure you of that. 546 01:13:06.240 --> 01:13:19.110 Ted Marena: Okay. So I think, you know, we have recorded this, so we'll be posting it, and I already sent out the link, the Box folder link, for everything. 547 01:13:20.400 --> 01:13:33.060 Ted Marena: And so I don't think there's anything else. We can hang here for just a few more seconds to see if anyone has a question, and if not, we can just, you know, wish everyone a good night and thank everybody for attending. 548 01:13:39.750 --> 01:13:40.380 Zvonimir Bandic: Thank you. Good. 549 01:13:41.190 --> 01:13:48.720 Ted Marena: Thank you. Thanks, Curt; thanks, Zvonimir; thanks, Dejan. Thanks,
Wesley; Jeff, I really appreciate your help on this as well. 550 01:13:51.030 --> 01:13:51.780 Dejan: Thank you, everyone. 551 01:13:52.560 --> 01:13:53.220 Ted Marena: You too. 552 01:13:53.550 --> 01:13:55.260 Ted Marena: Yep. Bye bye. Have a good night. 553 01:13:55.560 --> 01:13:56.430 Zvonimir Bandic: Thank you. Bye bye. 554 01:14:27.360 --> 01:14:28.470 Ted Marena: Thanks, Jeff. Alright, bye bye. 555 01:14:30.270 --> 01:14:30.810 Jeffrey Osier-Mixon: See you later. 556 01:14:31.080 --> 01:14:32.130 Zvonimir Bandic: Yeah, I stayed... 557 01:14:32.400 --> 01:14:33.510 Ted Marena: I stayed to stop the 558 01:14:33.510 --> 01:14:34.230 recording. 559 01:14:35.760 --> 01:14:38.670 Zvonimir Bandic: There are still comments coming in on 560 01:14:38.820 --> 01:14:39.930 Zvonimir Bandic: on the Zoom chat, so 561 01:14:40.110 --> 01:14:42.510 Ted Marena: oh really? Okay, I'm hanging on.