
Configure Two NICs for Different Networks on the Same Machine

Question

How do I correctly configure two NICs on separate networks? The goal is to have all traffic returned from the interface it was received on. If anyone can provide an example of what the routing table should look like, I can create the Netplan config from there.

Update 1

  • Added hardware diagram
  • Simplified config.yaml, which fixed pinging out from eno2
  • Return traffic is still not routed correctly when testing with iperf

Update 2

  • Small clarification of the goal (two NICs, two subnets)
  • Requested an example of ip r s output as a target

Problem

Both NICs are reachable, but return traffic is only routed through one of them. Specifying tables in Netplan has not worked, though I am probably getting the routing policies wrong in the config YAML.
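
For reference, the "Configuring source routing" example in the Netplan documentation maps onto this setup roughly as below. This is only a sketch of the syntax, not a verified fix: the table IDs 101 and 102 are arbitrary, and the x.x.x.1 gateway on the lab side is an assumption based on the DNS config further down.

# sketch only: per-interface tables so replies leave the interface they arrived on
network:
  version: 2
  ethernets:
    eno2:
      addresses: [y.y.y.105/24]
      routes:
        - to: default
          via: y.y.y.1            # main-table default (unchanged)
        - to: default
          via: y.y.y.1
          table: 101              # mgmt default for policy-routed replies
        - to: y.y.y.0/24
          scope: link
          table: 101              # direct route for the connected subnet
      routing-policy:
        - from: y.y.y.105
          table: 101              # traffic sourced from .105 uses table 101
  bridges:
    br0:
      interfaces: [bond0]
      addresses: [x.x.x.71/24]
      routes:
        - to: default
          via: x.x.x.1            # assumed lab gateway
          table: 102
        - to: x.x.x.0/24
          scope: link
          table: 102
      routing-policy:
        - from: x.x.x.71
          table: 102              # traffic sourced from .71 uses table 102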

Target Configuration

I would like traffic to be segregated between the NICs. They do not need to be isolated, but traffic on the LAB LAN should stay on the bonded NIC, and traffic on the MGMT LAN should stay on the eno2 device.

┌─────────────────┐      ┌──────────────────────┐
│   y.y.y.0/24    │      │   x.x.x.0/24         │
│   mgmt network  │      │   lab network        │
└─┬───────────────┘      └─┬──┬─────────┬──┬────┘
  │                        │┼┼│         │┼┼│
  │                        │┼┼│         │┼┼│
  │                        │┼┼│      ┌─=┴==┴=─────┐ 
  │                        │┼┼│      | file server|
  │                        │┼┼│      └────────────┘
┌─=───────────────────────=┴==┴=─────────────────┐
│ y.y.y.105               x.x.x.71               │
│                                                │
│              GPU Server                        │
└────────────────────────────────────────────────┘

Hardware Diagram

This is the topology of the relevant hardware in the stack.

[Network architecture diagram]

Netplan Config YAML

#50-netplan-config.yaml
network:
  version: 2
  ethernets:
    # management network y.y.y.0/24
    eno2:
      dhcp4: no
      dhcp6: no
      addresses: [y.y.y.105/24]
      routes:
        - to: default
          via: y.y.y.1

      nameservers:
        addresses: [y.y.y.1, 1.1.1.1, 1.0.0.1]
        search: [local, lab]

    # member interfaces for bond0
    enp129s0f0:
      dhcp4: no
      dhcp6: no

    enp129s0f1:
      dhcp4: no
      dhcp6: no

    enp129s0f2:
      dhcp4: no
      dhcp6: no

    enp129s0f3:
      dhcp4: no
      dhcp6: no

  bonds:
    bond0:
      interfaces: [enp129s0f0, enp129s0f1, enp129s0f2, enp129s0f3]
      parameters:
        lacp-rate: fast
        mode: 802.3ad
        transmit-hash-policy: layer3+4
        mii-monitor-interval: 100
        ad-select: bandwidth

  bridges:
    # lab network x.x.x.0/24
    br0:
      dhcp4: no
      dhcp6: no
      interfaces: [bond0]
      addresses: [x.x.x.71/24]

      nameservers:
        addresses: [x.x.x.1, 1.1.1.1, 1.0.0.1]
        search: [local, lab]
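
If anyone tries this config over SSH, netplan try is safer than netplan apply, since it reverts automatically (after 120 seconds by default) unless the change is confirmed:

$ sudo netplan try      # applies, waits for confirmation, reverts on timeout
$ sudo netplan apply    # applies permanently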

Testing

Ping out from the server

Works on eno2

$ ping -c 3 -I eno2 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from y.y.y.105 eno2: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=52 time=10.3 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=52 time=10.4 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=52 time=10.3 ms

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 10.309/10.324/10.353/0.020 ms

Works on br0

$ ping -c 3 -I br0 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from x.x.x.71 br0: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=52 time=10.1 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=52 time=10.3 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=52 time=10.6 ms

--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 10.102/10.339/10.573/0.192 ms

Check routing table(s)

$ ip route show
default via y.y.y.1 dev eno2 proto static
blackhole 10.1.228.192/26 proto 80
x.x.x.0/24 dev br0 proto kernel scope link src x.x.x.71
y.y.y.0/24 dev eno2 proto kernel scope link src y.y.y.105
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
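
For the target state I am after, the main table above could stay as it is, with the segregation done by extra rules and per-interface tables. A hand-built sketch with plain iproute2 (table IDs 101/102 are arbitrary; x.x.x.1 as the lab gateway is an assumption):

$ # per-interface tables: connected subnet plus that interface's own default
$ sudo ip route add y.y.y.0/24 dev eno2 src y.y.y.105 table 101
$ sudo ip route add default via y.y.y.1 dev eno2 table 101
$ sudo ip route add x.x.x.0/24 dev br0 src x.x.x.71 table 102
$ sudo ip route add default via x.x.x.1 dev br0 table 102
$ # rules: replies sourced from each address look up their own table
$ sudo ip rule add from y.y.y.105 lookup 101
$ sudo ip rule add from x.x.x.71 lookup 102
$ # inspect the result
$ ip rule show
$ ip route show table 101
$ ip route show table 102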

Test MGMT Network

Test the eno2 interface with iperf and nload. The results show that traffic to the GPU server arrives on the correct interface, but return traffic leaves via bond0 (br0).
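
Independent of iperf, the kernel's reply-path choice can be checked directly with ip route get, using the test client's address (172.30.30.229, as seen in the iperf output below) as the destination; the dev in each answer is the interface replies from that source address will use:

$ ip route get 172.30.30.229 from y.y.y.105
$ ip route get 172.30.30.229 from x.x.x.71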

❯ iperf -c y.y.y.105 -r -f G
------------------------------------------------------------
Client connecting to y.y.y.105, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 172.30.30.229 port 60716 connected with y.y.y.105 port 5001 (icwnd/mss/irtt=14/1448/5000)
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.31 sec  0.150 GBytes  0.015 GBytes/sec
[  2] local 172.30.30.229 port 5001 connected with x.x.x.71 port 53436
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.12 sec  0.171 GBytes  0.017 GBytes/sec
$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local y.y.y.105 port 5001 connected with 172.30.30.229 port 58370
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.2382 sec   161 MBytes   132 Mbits/sec
------------------------------------------------------------
Client connecting to 172.30.30.229, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ *2] local x.x.x.71 port 36346 connected with 172.30.30.229 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *2] 0.0000-10.2344 sec   151 MBytes   124 Mbits/sec
$ nload eno2

Device eno2 [y.y.y.105] (1/1):
=============================================================================================================
Incoming:
                         ######################
                         ######################
                         ######################
                         ######################                           Curr: 1.49 kBit/s
                         ######################                           Avg: 20.73 MBit/s
                         ######################                           Min: 1.02 kBit/s
                         ######################                           Max: 190.57 MBit/s
                         ######################                           Ttl: 676.95 MByte
Outgoing:




                                                                          Curr: 0.00 Bit/s
                                                                          Avg: 0.00 Bit/s
                                                                          Min: 0.00 Bit/s
                                                                          Max: 0.00 Bit/s
                                                                          Ttl: 9.99 MByte

Note: nload br0 and nload bond0 open the same device in the nload window.

$ nload br0

Device bond0 (1/15):
==============================================================================================================
Incoming:
                                                                           Curr: 3.84 kBit/s
                                                                           Avg: 192.22 kBit/s
                                                                           Min: 952.00 Bit/s
                                                 .|.                       Max: 1.81 MBit/s
                                            .###################|.         Ttl: 7.30 MByte
Outgoing:
                                            ######################
                                            ######################
                                            ######################
                                            ######################
                                            ######################
                                            ######################         Curr: 21.80 kBit/s
                                            ######################         Avg: 21.51 MBit/s
                                     .      ######################         Min: 4.16 kBit/s
                         |.|###||#####|.....######################         Max: 162.19 MBit/s
                        |#########################################         Ttl: 694.43 MByte

Test Lab Network

Meanwhile, network traffic is as expected on the br0 interface.

❯ iperf -c x.x.x.71 -r -f G
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to x.x.x.71, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 172.30.30.229 port 59950 connected with x.x.x.71 port 5001 (icwnd/mss/irtt=14/1448/3000)
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.12 sec  0.159 GBytes  0.016 GBytes/sec
[  2] local 172.30.30.229 port 5001 connected with x.x.x.71 port 33270
[ ID] Interval       Transfer     Bandwidth
[  2] 0.00-10.20 sec  0.167 GBytes  0.016 GBytes/sec
$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local x.x.x.71 port 5001 connected with 172.30.30.229 port 59950
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.1135 sec   163 MBytes   135 Mbits/sec
------------------------------------------------------------
Client connecting to 172.30.30.229, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ *2] local x.x.x.71 port 33270 connected with 172.30.30.229 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *2] 0.0000-10.2124 sec   171 MBytes   140 Mbits/sec
$ nload br0

Device bond0 (1/12):
=============================================================================================================
Incoming:
                        ######################
                        ######################
                        ######################
                        ######################                            Curr: 3.85 kBit/s
                        ######################                            Avg: 44.04 MBit/s
                        ######################                            Min: 3.85 kBit/s
                        ######################     .                      Max: 174.40 MBit/s
                        ######################.|#||##.|###|||||||.        Ttl: 3.35 GByte
Outgoing:
                                            ######################
                                            ######################
                                            ######################
                                            ######################
                                            ######################        Curr: 13.89 kBit/s
                                            ######################        Avg: 47.11 MBit/s
                          .  ....  .   ..   ######################        Min: 4.16 kBit/s
                         .##########||####||######################.       Max: 165.06 MBit/s
                        .##########################################       Ttl: 2.86 GByte
Ax0n
  • I'm not going to attempt to answer this because I am unfamiliar with netplan, but... I will say I have never seen separate "routing tables" on a single device. All the routing goes through one table and you configure filters to determine where traffic is supposed to go. Even with 100 NICs, you don't need more than one routing table. – rfportilla Oct 22 '22 at 01:03
  • Thanks @rfportilla. Netplan was throwing an error suggesting multiple tables, and it's in their documentation too. :/ I will test some rules without specific tables next. I appreciate the comment. https://netplan.io/examples (towards the bottom, in the "Configuring source routing" section) – Ax0n Oct 22 '22 at 01:08
  • First of all, your routes are sort of wrong. You don't need / want to use a/the gateway (`via`) for the prefix route. Also the `from`s are pointless when they are the same. Here `from` refers to the replying source address (`.71` / `.105`). However, I'm not sure if the `from` approach would work anyway. (It has something to do with whether the source address has been set on the traffic yet when the lookup is performed, I think.) At least with some cases / programs, you'll need a "stateful" approach instead (`fwmark`, with the help of iptables / nftables). – Tom Yan Oct 22 '22 at 01:44
  • This seems like a truly horrendous way to handle routing. What is wrong with configuring zebra/quagga/frr/ospf etc. on each `router` and letting that software deal with it all? – Bib Oct 22 '22 at 10:44
  • Thanks @TomYan, I was able to get ping to work going out from each router, but I still can't get the response to come from the intended NIC. – Ax0n Oct 22 '22 at 14:38
  • @Bib, the reason I am configuring this here is so the GPU machine can take advantage of the higher throughput to the file server. I will update the ascii diagram. – Ax0n Oct 22 '22 at 14:39
  • Yes, and load the routing protocols on the GPU machine. You are needlessly making it difficult. If you have more than one network, routing protocols should be your first port of call. – Bib Oct 22 '22 at 14:55
  • Hmm wait. I thought the two relevant (logical) interfaces were connected to the same network (I mean the same LAN / broadcast domain). In that case you shouldn't need policy routing for LAN traffic at all. So is the segregation needed for replies to traffic from the WAN side / the Internet? In that case the routes you need in the alternate route tables are `default` routes, and they should be *indirect* routes consisting of the corresponding gateway (`via`). (But you should also have *direct* routes that have no `via` for their prefixes / LAN subnets in the tables.) – Tom Yan Oct 23 '22 at 01:22
  • @TomYan - they are separate physical NICs on separate subnets. Both have internet access (but will have different rules in the firewall). Can you answer this question with what the routing table should look like? I think I can get the config correct from there. E.g., if I run `ip r s`, what should I see? The goal is to have all traffic returned from the interface it was received on. - thanks again – Ax0n Oct 23 '22 at 02:50

1 Answer


If I read your diagram correctly, the GPU server and the file server are on the same L2 network, so there really isn't any routing involved.

The LACP policy on the switch has to be configured to take L4 into account for the hashing as well, so that different transport streams will target different bond members. I assume you have multiple TCP/UDP sessions here; otherwise it will not distribute the load at all.
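
One place to sanity-check the host side of that (just where to look, not a confirmed diagnosis): the kernel exposes the negotiated bonding state under /proc, including the transmit hash policy and the 802.3ad aggregator each member joined. The switch-side hashing still has to be verified in the switch's own LACP configuration.

$ cat /proc/net/bonding/bond0   # look for Bonding Mode, Transmit Hash Policy,
                                # and the per-member Aggregator ID entries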

Dag S