class: center, middle, contrast

# P4 in the wild: Line-rate packet forwarding for the SCION future Internet architecture

Kamila Součková

Master thesis (in progress)

SDN Switzerland, 2019-07-05

---
class: contrast, expanded, middle, narrow

###### Background

.indenthalf[**What is SCION?**]

###### My project

1. **Aims**
2. **Challenges**
3. **Current status**
4. **Future plans**

---

# SCION future Internet architecture

--

→ changing the Internet is really hard, so… Why?

--

.width100[data:image/s3,"s3://crabby-images/ca1ce/ca1ce9f73436d4642d7aabec80eef5cc694430b9" alt="bgp route leak article screenshot"]

---

.width100[data:image/s3,"s3://crabby-images/ca1ce/ca1ce9f73436d4642d7aabec80eef5cc694430b9" alt="bgp route leak article screenshot"]

.pushuplots[
.small.center[last week's BGP route leak is only one of many examples of the current Internet's shortcomings]

SCION is designed for **.hl[route control]**, **.hl[failure isolation]**, and **.hl[explicit trust information]** for end-to-end communication
]

---
class: no-margins

# SCION

_.hl[Scalability]_

_.hl[Control]_

_.hl[Isolation]_

_.hl[on Next-generation Networks]_

---
count: false
class: no-margins

# SCION

_.hl[Scalability]_

* packet-carried forwarding state
* hierarchical design

_.hl[Control]_

* end-host selects path
* ISPs decide available paths

_.hl[Isolation]_

* Isolation Domains (failures stay within one)
* built-in DoS protection

_.hl[on Next-generation Networks]_

* new control plane & data plane (replaces IP + BGP)
* endhost-controlled multipath for free :-)

---
class: expanded

# Isolation Domains (ISDs)

.floatright.width65[data:image/s3,"s3://crabby-images/c6496/c6496265cc1046c933079536c054dc5a80d8fbe7" alt="ISDs"]

* e.g.
ISD ≈ country
* PKI per ISD
* routing within ISD independent
* managed by **core ASes**: PKI, inter-ISD control plane
* non-core ASes: only intra-ISD (simple)

---
class: expanded1

.floatright.width50[data:image/s3,"s3://crabby-images/42aa6/42aa6f01a0b078698c6ab758c8c3adfec6735ac1" alt="PCBs"]

# Path discovery

* core ASes create *path construction beacons* (PCBs) & flood
* hierarchical:
  * inter-ISD: among core ASes
  * intra-ISD: only downstream
* AS receiving beacon adds itself & forwards it
* ⇒ PCB contains known good path to core
* adds its signature ⇒ PCBs can be verified

---

# Path discovery

* AS chooses how to extend / what to forward ⇒ enables routing policies
* available *path segments* registered to *path servers*
* end hosts request from path servers
* segments combined into end-to-end paths:

.center[
.displayinlineblock.width30[data:image/s3,"s3://crabby-images/a2a19/a2a19562d52ae6cf8801096fdd6bdb81adfd887b" alt="segments example 1"]
.displayinlineblock.width30[data:image/s3,"s3://crabby-images/1e6e0/1e6e0829b7578490cbd51918a50693d1c5ac68b2" alt="segments example 2"]
.displayinlineblock.width30[data:image/s3,"s3://crabby-images/c0e60/c0e605b9c4631158855f15e5f00f46d51913754c" alt="segments example 2"]
]

---

# Path discovery

every PCB contains *hop fields* for path construction:

* ingress, egress interface
* timestamp + expiration time
* chained cryptographic MACs with AS-specific keys ⇒ paths cannot be "invented"

.center.width90[data:image/s3,"s3://crabby-images/d499c/d499cadf0053cfaf0c06a009afdd50e33ce58b92" alt="HF MACs chaining"]

---
class: expanded

# Packet forwarding

.floatright.width60[data:image/s3,"s3://crabby-images/21ee5/21ee5e72761eda6a534fc47c0111335bd82e2c97" alt="forwarding path packet"]

* path in packet header
* every border router processes only its own *hop field*
* checks to ensure path authenticity

---
class: contrast, expanded, middle, narrow

###### SCION border router in P4

1. **Aims**
2. **Challenges**
3. **Current status**
4. **Future plans**

---
exclude: true
class: expanded

# Aims

* **Ready-to-deploy SCION border router**
  forwarding at 40Gbps or more
* **SCION as a library**
  Modular, portable, high-performance P4 code for parsing, verification, and forwarding
* **Guidelines for high-speed P4**
  What have we learned about high-speed P4-based packet processing on FPGAs?
* **Optimising the SCION protocol for HW**
  How can we adjust SCION to enable more efficient implementations in HW?

---
class: contrast, expanded, middle, center

# Aims

---
exclude: true

# #1: Prod-ready SCION BR

.center.width100[data:image/s3,"s3://crabby-images/350ec/350ec74e497e33d37490aa620427cf53346d5414" alt="Parts of the project"]

---
class: expanded1

# #1: Ready-to-deploy SCION BR

* forwarding at **40Gbps or more**
* usable with real traffic
* integrated with existing SCION infra
  * control plane
  * monitoring / metrics

---

# #2: SCION as a library

.center[Modular, portable, high-performance P4 code for parsing, verification, and forwarding]

???

Now, why is this a good idea?

1. because modular code is easier to deal with
2. [NEXT] because you can base other things on it

---

.center.width100[data:image/s3,"s3://crabby-images/350ec/350ec74e497e33d37490aa620427cf53346d5414" alt="Parts of the project"]

???

you can take my BR and pick and choose which parts you want:

---

.center.width100[data:image/s3,"s3://crabby-images/2e88f/2e88f47209552c1c872a570041032022ffbdda94" alt="Parts of the project"]

???

bring your own AES or even use a completely different MAC validation scheme

---

.center.width100[data:image/s3,"s3://crabby-images/ceb0b/ceb0bce6dea11927a992416280437ce59e229a52" alt="Parts of the project"]

???
change parser: swap your L2, use different encaps, or even parse the IFIDs in HFs in some special way

---
count: false

.center.width100[data:image/s3,"s3://crabby-images/350ec/350ec74e497e33d37490aa620427cf53346d5414" alt="Parts of the project"]

---

.center.width100[data:image/s3,"s3://crabby-images/17504/17504d372b497ee0973403ec6db2650988958cde" alt="NMS"]

.center[Example: Network monitoring system]

---

.center.width100[data:image/s3,"s3://crabby-images/de5a2/de5a2474bba003dfe071ba37a06d72d45d0d71aa" alt="NMS"]

.center[Example: Special-purpose end host with a P4-capable SmartNIC]

???

You could have a special-purpose end host with a P4-enabled SmartNIC, and do some of the data processing straight in the NIC => use my parser and deparser in a completely different architecture

---
exclude: true

# #3: Guidelines for high-speed P4

**a) P4+NetFPGA project template suitable for large projects**

* software-engineer-friendly
* simple way to add tests
* ready for multi-platform usage
* already used by 2 other ETH students

---
exclude: true

# #3: Guidelines for high-speed P4

**P4+NetFPGA guidelines + project template suitable for large projects**

.center.huge[:-(]

Discarded, as the current P4 toolchain for NetFPGA is **not** suitable for production (see Challenges for details).

---

# #3: Guidelines for high-speed P4

**tips that help software engineers write P4 code that performs well on FPGA-based hardware**

**meeting timing requirements** is a frequent problem ⇒ check the **critical path** in your implemented design

* find it in the timing report
* you'll have to guess about the correspondence to the P4 code, but it can be done

---

# #3: Guidelines for high-speed P4

some examples:

* to meet timing, avoid long data paths
* `inout` parameters get compiled into long paths ⇒ avoid them in critical spots
* rewrite code like this:
.floatleft[
```sh
if [long computation 1] then
    [long computation 2]
```
]

.floatright[
```sh
[computation 1];
[computation 2];
[combine the results]
```
]

⇒

* CAM tables (`exact` match) are relatively expensive ⇒ prefer to combine multiple lookups into one or completely avoid tables

???

and where to look to find out which spots are critical

---

# #4: Optimising the SCION protocol for HW

.center[
How can we adjust SCION to enable more efficient implementations in HW?

⇒ to answer, let's talk about the challenges...
]

???

so, those are the aims, now, how do we meet them? what are the most challenging aspects of the project?

---
class: contrast, expanded, middle, center

# Challenges

???

---

# Making progress quickly: Reuse, Reduce, Recycle

.center.width80.pushup[data:image/s3,"s3://crabby-images/a69c4/a69c400327a5d7def0ebc59dd3f470779693ad1c" alt="Parts of the project"]

transparently pass unhandled packets to SW through 1:1 "fake" network interfaces (really DMA)

⇒ iteratively move functionality into HW

---
layout: true

# Implementing the parser

SCION header:

.floatleft.margin0[
data:image/s3,"s3://crabby-images/c7bfa/c7bfa97a2493c2865085bb8c1bfa7f99720ecfbf" alt="header stack"
]

---

???
my experience with NetFPGA, this would be different with a different compiler

---

.floatright.margin0[
First idea: Use header stacks:

```c
struct ScionHeader_t {
    …
    HopField_h[32] seg1_hfs;
    …
}
```
]

---

.floatright.margin0[
First idea: Use header stacks:

```c
struct ScionHeader_t {
    …
    HopField_h[32] seg1_hfs;
    …
}
```

.red[NetFPGA compiler does not support header stacks]
]

---
layout: false

# Implementing the parser

Fix: We don't need to parse the whole path, we just need to save it so we can emit it later ⇒ use a `varbit` field

.floatleft.margin0.width30[
data:image/s3,"s3://crabby-images/72d1e/72d1e05cf7a05f50056b5bd645f1e8ae2a24bee4" alt="header stack"
]

--

.floatright.margin0.red[
NetFPGA compiler does not support `varbit`
]

---

# Implementing the parser

Fix: Don't even save the path: use `packet_mod`:

```c
parser ExModDeparser(packet_mod p, in headers_t h) {
    state start {
*       p.update(h.ethernet);
        transition select(h.ethernet.ethertype) {
            ETHERTYPE_IPV4: deparse_ipv4;
            ETHERTYPE_IPV6: deparse_ipv6;
        }
    }
    …
}
```

Xilinx extension, **not** standard P4, but standard P4 can use `varbit`

---

# Implementing the parser

Fix: Don't even save the path: use `packet_mod`:

1. modify NetFPGA design to use the `packet_mod`-enabled `XilinxStreamSwitch` architecture .light[\* experimental :-)]
2. skip over the path with `packet.advance(size)`

--

.red[NetFPGA requires even `packet.advance(size)` to be a compile-time constant]

--

.center[Therefore…]

---

# Implementing the parser

.light[Fix: Don't even save the path: use `packet_mod`:]

1. .light[modify NetFPGA design to use the `packet_mod`-enabled `XilinxStreamSwitch` architecture] .light[\* experimental :-)]
2. .light[skip over the path with `packet.advance(size)`]
3. **Create a lot of separate sub-parsers: one for skipping 1 HF, one for skipping 2, etc.**
   Select the correct sub-parser at runtime, but the jump size is fixed at compile time.

---

# Implementing the parser

Create a lot of separate sub-parsers: up to .hl[max length].
Problem: To support reasonably long paths (~64 HFs), we need a lot of sub-parsers.

⇒ Requires a lot of FPGA area + RAM to build it.

---
layout: true

# Implementing the parser

Create a lot of separate sub-parsers: up to .hl[max length].

Problem: To support reasonably long paths (~64 HFs), we need a lot of sub-parsers.

⇒ Requires a lot of FPGA area + RAM to build it.

Fix: Two stages of max length 8:

---
count: false

.width100[data:image/s3,"s3://crabby-images/18966/1896669cfa3cf4c17f0d3a2c764a77c77e306214" alt="sqrt1"]

---
count: false

.width100[data:image/s3,"s3://crabby-images/3ab8a/3ab8a7a3937d01858a8dec93d0a2cbf3c4c085e0" alt="sqrt2"]

---
count: false

.width100[data:image/s3,"s3://crabby-images/994f7/994f73a4356d90a238eb95cfa5bdb1e0d9c162ae" alt="sqrt3"]

---
count: false

.width100[data:image/s3,"s3://crabby-images/43ef0/43ef09600137af0ae27eba819c89ce2261b693e4" alt="sqrt4"]

---
layout: false
class: expanded1

# Portability

* keeping the code portable despite the differences in P4 support
* P4 should be target-independent, but…
  * different externs
  * different compiler limitations (ahem)
* making it simple to compile this BR for a different platform

---

# Portability

Pieces of the solution:

* preprocessor `#define`s decouple platform from features; code depends only on features
* well-thought-out repo structure ⇒ minimise work needed to add new platform
* do it & document it

Planning to support at least:

* NetFPGA
* P4lang's software switch
* later: other FPGA-based NICs

---
class: expanded1

# Performance / Timing

* at 200MHz, the timing is very tight
* NetFPGA compiler has many bugs
  * workarounds cost extra resources
* complicated by P4 ⇒ HDL translation
  * no direct control over the resulting design

---
exclude: true

# Making progress quickly

* modular design makes it easier to test incremental changes and locate bugs
* tackling the hard bits first ⇒ I can't get stuck
* **the NetFPGA's P4 stack is pretty much a PoC**; documentation is generally severely lacking
therefore:

* try things ASAP
* change plan if something doesn't work
* **guess**

---

# Suggestions for SCION protocol

* require few table lookups ✓
* variable lengths are bad ✓
  * SCION avoids them where possible
* define maximum sizes:
  * max path length
  * max HF size
* make things explicit
  * include end host address length in header
* consider re-thinking MAC chaining for peering paths
* further research into the tradeoffs is needed

---
class: contrast, expanded, middle, center

# Status

---
class: bottom

.center.width100[data:image/s3,"s3://crabby-images/15c82/15c828c296069bcd9c47e87c78e707bb4b3c62e6" alt="Status"]

---
class: expanded1

# Future plans

* SCION packet header redesign
* production-readiness
  * handle all the cases
  * with a different P4 compiler...
* power requirements optimisation
  * compare with IP
* make it faster!
  * planning 1Tbps using multiple FPGA-enabled NICs

---
exclude: true

# Current status

The look of this slide: cases pic from SCION book that shows which cases I implement and then on click a big number saying 40Gbps (hopefully :D)

---
class: contrast, center, middle

# Say hi!

Email: **scion@kamila.is**

Twitter: **@anotherkamila**

Matrix: **@kamila:unchat.cat**
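
---

# Backup: two-stage path skipping

A minimal sketch of the arithmetic behind "two stages of max length 8", written in Python rather than P4. It assumes the first stage skips a multiple of 8 HFs and the second stage skips the remaining 0 to 7. The slides do not spell out the exact split, and `split_skip`, `subparser_counts` and `STAGE_SIZE` are hypothetical names used only for illustration.

```python
STAGE_SIZE = 8  # assumption: each stage offers 8 fixed-size sub-parsers


def split_skip(n_hfs: int, stage_size: int = STAGE_SIZE) -> tuple[int, int]:
    """Split a runtime skip of n_hfs hop fields into two fixed-size jumps.

    Returns (coarse, fine): stage 1 skips coarse * stage_size HFs and
    stage 2 skips fine HFs, with 0 <= fine < stage_size. Each jump size is
    one of a small set of constants, so it can be fixed at compile time.
    """
    if not 0 <= n_hfs < stage_size * stage_size:
        raise ValueError("path too long for two stages of this size")
    return divmod(n_hfs, stage_size)


def subparser_counts(stage_size: int = STAGE_SIZE) -> tuple[int, int]:
    """Number of fixed-size sub-parsers needed: (single stage, two stages)."""
    single = stage_size * stage_size  # one per skip length (0..63 for size 8)
    two_stage = 2 * stage_size        # coarse sub-parsers + fine sub-parsers
    return single, two_stage


coarse, fine = split_skip(19)
print(coarse, fine)        # stage 1 skips coarse * 8 HFs, stage 2 skips fine HFs
print(subparser_counts())  # (single-stage count, two-stage count)
```

Under this assumption, 16 fixed-size sub-parsers cover the same 64 skip lengths that would otherwise need one sub-parser each.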