Intro to DMVPN

Although this has been well blogged before, as we’ll be rolling DMVPN out where I work I’d like to record the steps I took to configure and verify a working example in GNS3.

Software

The first hurdle I came across is that the image I’d upgraded my virtual Cisco 3725s to (c3725-adventerprisek9-mz124-25.bin, in order to support EIGRP for IPv6) doesn’t have everything you need for DMVPN. Although it seems to be possible to get things working, the ‘show dmvpn’ command is missing. I rolled back to c3725-adventerprisek9-mz.124-15.T14.bin and all was well.

Lab

Here is a screenshot of my GNS3 topology. Linknets between routers are shown in green, client networks in dark blue. I’ve included a box showing the multipoint GRE tunnel peers in light blue.

Screen Shot 2014-02-24 at 11.33.29

Base Config

I’ve attached full configs to the post. Prior to configuring DMVPN I set up OSPF on all routers. The intention here is to mimic each site’s Internet connection. I probably should have used BGP or a static route to 0.0.0.0/0 via R5 on each router, but at this stage I simply wanted all routers to be aware of all green subnets so that the mGRE tunnel would come up. This is the relevant config on R5:

interface FastEthernet0/0
 description R1
 ip address 192.168.15.2 255.255.255.252
!
interface FastEthernet0/1
 description R2
 ip address 192.168.25.2 255.255.255.252
!
interface FastEthernet1/0
 description R3
 ip address 192.168.35.2 255.255.255.252
!
interface FastEthernet2/0
 description R4
 ip address 192.168.45.2 255.255.255.252
!
router ospf 100
 router-id 5.5.5.5
 network 0.0.0.0 255.255.255.255 area 0
 default-information originate
!
ip route 0.0.0.0 0.0.0.0 Null0
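
The OSPF network statement above uses a wildcard mask, where a 1 bit means "don't care". As a rough illustration of how such a statement selects interfaces, here is a small Python sketch. The function name is my own, the spoke statement (192.168.15.0 0.0.0.3) is a hypothetical example of a sister config, and 10.10.10.254 is an assumed client interface address for R1.

```python
import ipaddress

def ospf_network_matches(interface_ip, network, wildcard):
    """Return True if interface_ip is selected by 'network <network> <wildcard>'."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    # Invert the wildcard to get the bits that must match.
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (ip & care) == (net & care)

# R5: 'network 0.0.0.0 255.255.255.255' cares about no bits, so every interface matches.
print(ospf_network_matches("192.168.45.2", "0.0.0.0", "255.255.255.255"))  # True

# Hypothetical statement on R1 covering only its green linknet.
print(ospf_network_matches("192.168.15.1", "192.168.15.0", "0.0.0.3"))     # True
# An assumed client interface address on R1 falls outside it.
print(ospf_network_matches("10.10.10.254", "192.168.15.0", "0.0.0.3"))     # False
```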

A sister config was added to each of the other routers (without default-information originate), enabling OSPF on fa 0/0 only, that is, matching the local green subnet. A quick look at the OSPF entries in R1’s routing table shows that we’re ready:

R1#show ip route ospf
192.168.45.0/30 is subnetted, 1 subnets
O 192.168.45.0 [110/11] via 192.168.15.2, 01:47:25, FastEthernet0/0
192.168.25.0/30 is subnetted, 1 subnets
O 192.168.25.0 [110/20] via 192.168.15.2, 01:47:25, FastEthernet0/0
192.168.35.0/30 is subnetted, 1 subnets
O 192.168.35.0 [110/11] via 192.168.15.2, 01:47:25, FastEthernet0/0
O*E2 0.0.0.0/0 [110/1] via 192.168.15.2, 01:47:25, FastEthernet0/0

Multipoint GRE Tunnel

I’ve selected 172.16.0.0/24 for my tunnel interfaces. The hub and all spokes will need a logical interface in this subnet to act as the local tunnel ingress/egress. Each router gets a config like this, the example here being R1. Note that there is no tunnel destination on any router.

interface Tunnel0
 ip address 172.16.0.1 255.255.255.0
 tunnel source 192.168.15.1
 tunnel mode gre multipoint

R2 would have a tunnel source of 192.168.25.1 etc.

The multipoint GRE tunnel is treated like a Non Broadcast Multi Access (NBMA) network, much like point-to-multipoint frame-relay.

NHRP

The point of DMVPN is that any spoke should be able to bring up a tunnel to any other spoke on demand. We need a mechanism to define our hub router, and for spoke routers to be able to resolve the physical IP address in use by another spoke. Enter Next Hop Resolution Protocol. Spoke routers – NHRP clients – query the hub router, or Next Hop Server (NHS), to learn the address of another spoke router with which they need to bring up a tunnel. It is notable that although each router in the topology has an interface in 172.16.0.0/24, this network is not known by R5 (my ‘Internet’ router). Instead, NHRP maps interfaces in this subnet to real interfaces in ‘globally routable’ subnets (inverted commas as I’ve used RFC1918 addresses in this example).
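
To make the mapping idea concrete, here is a toy Python model of what the NHS does: it holds a table of tunnel address to physical (NBMA) address, filled in by spoke registrations and consulted for resolution requests. This is only a sketch of the concept, not how IOS implements NHRP; the class and method names are mine, and it ignores details such as hold timers and request forwarding. The addresses are the ones used in this lab.

```python
class NextHopServer:
    """Toy NHS: maps tunnel (protocol) addresses to NBMA (physical) addresses."""

    def __init__(self):
        self.cache = {}

    def register(self, tunnel_ip, nbma_ip):
        # A spoke registers its own mapping with the hub when its tunnel comes up.
        self.cache[tunnel_ip] = nbma_ip

    def resolve(self, tunnel_ip):
        # Answer a resolution request; None means the hub has no mapping.
        return self.cache.get(tunnel_ip)


hub = NextHopServer()
# Registrations from the three spokes in this lab (R2, R3, R4).
hub.register("172.16.0.2", "192.168.25.1")
hub.register("172.16.0.3", "192.168.35.1")
hub.register("172.16.0.4", "192.168.45.1")

# R4 wants to reach R3's tunnel address and asks where it really lives.
print(hub.resolve("172.16.0.3"))  # 192.168.35.1
```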

There are four commands needed to configure NHRP, although only two are used on the hub.

The ‘ip nhrp network-id’ must be the same across the DMVPN instance; it is quite possible and sometimes desirable to run multiple concurrent DMVPN overlays. For this lab, one is enough.

The ‘ip nhrp map multicast’ command sets up a mapping so that our NBMA network can carry multicast and broadcast packets. We then have two options:

R1(config-if)#ip nhrp map multicast ?
  A.B.C.D  IP NBMA address
  dynamic  Dynamically learn destinations from client registrations on hub

On a spoke, we enter the IP NBMA address of our hub so that multicast and broadcast traffic is sent there. We need to do that so that we can run an IGP over our mGRE network later. In my example that would be 192.168.15.1. On our hub we use the dynamic keyword.

Spokes need an additional map entry to link the hub’s tunnel address to its physical address, ‘ip nhrp map 172.16.0.1 192.168.15.1’ in our example.

Finally, for each spoke we need to define its NHS using the ‘ip nhrp nhs 172.16.0.1’ command.

Bringing it all together, to define our hub we add two commands to Tun0 on R1:

interface Tunnel0
 ip nhrp map multicast dynamic
 ip nhrp network-id 1

To define a spoke, we add four commands to Tun0 on R2, R3 and R4:

interface Tunnel0
 ip nhrp map 172.16.0.1 192.168.15.1
 ip nhrp map multicast 192.168.15.1
 ip nhrp network-id 1
 ip nhrp nhs 172.16.0.1
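
Since the three spokes differ only in their addresses, the complete Tunnel0 config can be templated. The Python sketch below assumes the addressing pattern seen in this lab (tunnel address 172.16.0.n and tunnel source 192.168.n5.1 for spoke Rn); the function name is my own and this is an illustration rather than the method I actually used.

```python
HUB_TUNNEL_IP = "172.16.0.1"
HUB_NBMA_IP = "192.168.15.1"

def spoke_tunnel_config(n):
    """Return the Tunnel0 config for spoke Rn (n = 2, 3 or 4 in this lab)."""
    lines = [
        "interface Tunnel0",
        f" ip address 172.16.0.{n} 255.255.255.0",
        f" ip nhrp map {HUB_TUNNEL_IP} {HUB_NBMA_IP}",
        f" ip nhrp map multicast {HUB_NBMA_IP}",
        " ip nhrp network-id 1",
        f" ip nhrp nhs {HUB_TUNNEL_IP}",
        f" tunnel source 192.168.{n}5.1",  # assumed pattern: R2 -> 192.168.25.1, etc.
        " tunnel mode gre multipoint",
    ]
    return "\n".join(lines)

for n in (2, 3, 4):
    print(spoke_tunnel_config(n))
    print("!")
```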

Dynamic MVPN

Like many things, DMVPN relies on a sensible routing table on each router. Here is the logical topology:

DMVPN

 

We need each router to have routes to the other routers’ client networks. EIGRP lends itself well to this. We configure it to match 172.16.0.0/24 on each router, as well as the local client networks. We’ll make the client interfaces passive to stop our routers peering with anything downstream. Finally we’ll need to tweak the defaults on the hub so that:

* split-horizon doesn’t prevent us advertising subnets back out to the other spokes
* we advertise the next hop IP as that of the originating router

Here is R1’s EIGRP config. The only change on the other routers is to match their local client networks, as they all use the same local physical interfaces in my lab.

router eigrp 1
 passive-interface FastEthernet0/1
 network 10.10.10.0 0.0.0.255
 network 172.16.0.0 0.0.0.255
 no auto-summary

On R1 we also add:

interface Tunnel0
! ip address 172.16.0.1 255.255.255.0
 no ip next-hop-self eigrp 1
 no ip split-horizon eigrp 1

Verification

First, let’s check the routing – each hub and spoke should know about all the other client networks via the mGRE tunnel:

R1#show ip route eigrp 
     20.0.0.0/24 is subnetted, 1 subnets
D       20.20.20.0 [90/297270016] via 172.16.0.2, 02:58:14, Tunnel0
     40.0.0.0/30 is subnetted, 1 subnets
D       40.40.40.252 [90/297270016] via 172.16.0.4, 02:58:16, Tunnel0
     30.0.0.0/24 is subnetted, 1 subnets
D       30.30.30.0 [90/297270016] via 172.16.0.3, 02:58:16, Tunnel0

R4#show ip route eigrp 
     20.0.0.0/24 is subnetted, 1 subnets
D       20.20.20.0 [90/310070016] via 172.16.0.2, 02:58:06, Tunnel0
     10.0.0.0/24 is subnetted, 1 subnets
D       10.10.10.0 [90/297270016] via 172.16.0.1, 02:58:06, Tunnel0
     30.0.0.0/24 is subnetted, 1 subnets
D       30.30.30.0 [90/310070016] via 172.16.0.3, 02:58:06, Tunnel0

On the hub, we should see our three peers:

R1#show dmvpn
Legend: Attrb --> S - Static, D - Dynamic, I - Incompletea
	N - NATed, L - Local, X - No Socket
	# Ent --> Number of NHRP entries with same NBMA peer

Tunnel0, Type:Hub, NHRP Peers:3, 
 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1    192.168.25.1      172.16.0.2    UP    never D    
     1    192.168.35.1      172.16.0.3    UP    never D    
     1    192.168.45.1      172.16.0.4    UP    never D

On each spoke, we’ll just see the hub:

R4#show dmvpn
Legend: Attrb --> S - Static, D - Dynamic, I - Incompletea
	N - NATed, L - Local, X - No Socket
	# Ent --> Number of NHRP entries with same NBMA peer

Tunnel0, Type:Spoke, NHRP Peers:1, 
 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1    192.168.15.1      172.16.0.1    UP 02:56:54 S

Now, let’s try and ping 30.30.30.254 (R3’s client interface) from 40.40.40.254 (R4’s client interface):

R4#ping 30.30.30.254 source fa 0/1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.30.30.254, timeout is 2 seconds:
Packet sent with a source address of 40.40.40.254 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/34/64 ms

Great, it worked. You can see that the dynamic tunnel has come up:

R4#show dmvpn
Legend: Attrb --> S - Static, D - Dynamic, I - Incompletea
	N - NATed, L - Local, X - No Socket
	# Ent --> Number of NHRP entries with same NBMA peer

Tunnel0, Type:Spoke, NHRP Peers:2, 
 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1    192.168.15.1      172.16.0.1    UP 03:00:22 S    
     1    192.168.35.1      172.16.0.3    UP    never D

I’ve uploaded a packet capture taken during the above test (on the R4-R5 link) to cloudshark.

Click on that link and you’ll see the capture filtered to NHRP traffic only. Note that packets 16 and 19 flow via the hub, but 20 and 21 are spoke-to-spoke. If you open up the NHRP Mandatory Part in packet 16 you’ll see that the source and destination protocol addresses are the mGRE tunnel endpoints and the source NBMA address is the physical interface of the router.

Screen Shot 2014-02-24 at 13.17.08

For packet 19 the protocol addresses are reversed (as expected for a reply) but now the source NBMA address is that of R3. Looking at the IP header confirms that this packet came from R1.

Screen Shot 2014-02-24 at 13.32.05

In packet 20, NHRP provides R4 with R3’s physical address:

 

Screen Shot 2014-02-24 at 13.35.39

In packet 21, R4 provides R3 with its physical address.

Screen Shot 2014-02-24 at 13.36.12

In packets 20 and 21, open the client information entry section to see that NHRP has returned 192.168.35.1, the physical address of R3 and endpoint for our dynamic, spoke-to-spoke tunnel.

Finally, let’s have a look at the pings. The first thing I should point out is that we are tunnelling the traffic, so there are two IP headers. In real life I would drop the IP MTU on each Tunnel interface to allow for the GRE encapsulation and extra IP header, plus any crypto.
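
As a back-of-the-envelope check on that MTU point, here is the arithmetic for plain GRE over IPv4 in Python. It assumes a 1500-byte link MTU, a 20-byte outer IPv4 header with no options and the 4-byte base GRE header (no tunnel key, as in this lab); the function name and the extra_overhead parameter are my own.

```python
LINK_MTU = 1500          # assumed Ethernet MTU on the physical links
OUTER_IPV4_HEADER = 20   # bytes, no IP options
GRE_HEADER = 4           # bytes, base GRE header without key or sequence fields

def tunnel_ip_mtu(link_mtu=LINK_MTU, extra_overhead=0):
    """Largest inner IP packet that fits without fragmenting the outer packet."""
    return link_mtu - OUTER_IPV4_HEADER - GRE_HEADER - extra_overhead

print(tunnel_ip_mtu())                  # 1476
print(tunnel_ip_mtu(extra_overhead=4))  # 1472, e.g. if a 4-byte GRE key were added
```

Crypto overhead varies with the transform and mode, so once IPsec is added a more conservative value such as 1400 is commonly chosen rather than computing it exactly.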

Packet 15’s outer IP header has the physical addresses of R4 and R1 as source and destination. The inner header carries addresses from each router’s local client subnet.

Screen Shot 2014-02-24 at 13.47.35

This continues until packet 22 (right after NHRP resolved the true address). From this point, the traffic is direct. Note that in this case, the request went via R1, but the reply came direct as the spoke-to-spoke tunnel had come up.

Screen Shot 2014-02-24 at 13.48.41

Summary

In this post I’ve described and implemented a simple (crypto-free) DMVPN topology. I may expand on this in future to add encryption.

Update: I’ve covered how to add encryption here.

Site-to-site IPSec VPN through NAT

This post follows on from the first in this series and looks at how to modify the config if there is NAT along the way as well as reviewing a couple of the verification commands.

I’ve attached the full configs here.

Network Diagram

IPSec with NAT

Premise

A branch office with an ADSL connection would like to access corporate and local resources without running a local client on office machines. Split tunnelling is not required; all traffic must be routed back up to the corporate HQ. Only one static IP has been provided by the ADSL ISP.

Config

We’ll need to port forward UDP 500 (IKE) so that our corporate ASA can connect to the branch ASA. On the ADSL router we use the following NAT rules:

ip nat inside source list LAN interface FastEthernet0/0 overload
ip nat inside source static udp 192.168.1.1 500 interface FastEthernet0/0 500

You’ll see I’ve moved the B-End IP of the IPSec tunnel to the ADSL router so the A-End config doesn’t change. All I need to do is renumber the blue linknet to my chosen RFC1918 subnet of 192.168.1.0/24 and give my ASA a new default route pointing at the ADSL router’s interface, and all is well.

Testing

One thing which has bitten me in the past is that an IPSec tunnel won’t come up until you send some traffic down it. Since I’m doing this in GNS3 and VPCs, I’ll open up my crypto-map to allow ICMP so that I can bring up the tunnel with some pings.

A-END

access-list OUTSIDE_CRYPTOMAP_10 extended permit icmp any 10.1.0.0 255.255.255.0

B-END

access-list OUTSIDE_CRYPTOMAP_10 extended permit icmp 10.1.0.0 255.255.255.0 any

I also brought up a loopback with IP 8.8.8.8 on R1, to give my host on the other side of the VPN something to ping. Finally I should say that I’m running OSPF on the two routers either side of the ‘public internet’ cloud, in order that the IPSec peers have a route to each other.

First I had a look to see if my IPSec SA had come up:

A# show crypto ipsec sa

There are no ipsec sas

Hmm.

VPCS[1]> ping 8.8.8.8
8.8.8.8 icmp_seq=1 timeout
8.8.8.8 icmp_seq=2 ttl=255 time=60.482 ms
8.8.8.8 icmp_seq=3 ttl=255 time=53.498 ms
8.8.8.8 icmp_seq=4 ttl=255 time=55.094 ms
8.8.8.8 icmp_seq=5 ttl=255 time=47.397 ms

IPSec SA Verification

After bringing up the tunnel by pinging 8.8.8.8 from a host behind the B-END ASA, I was able to verify it (apart from the ICMP Echo Replies I got) as follows:

A# show crypto ipsec sa
interface: outside
    Crypto map tag: outside_map, seq num: 10, local addr: 192.0.2.6

      access-list OUTSIDE_CRYPTOMAP_10 extended permit ip any 10.1.0.0 255.255.255.0 
      local ident (addr/mask/prot/port): (0.0.0.0/0.0.0.0/0/0)
      remote ident (addr/mask/prot/port): (10.1.0.0/255.255.255.0/0/0)
      current_peer: 192.0.2.129

      #pkts encaps: 4, #pkts encrypt: 4, #pkts digest: 4
      #pkts decaps: 4, #pkts decrypt: 4, #pkts verify: 4
      #pkts compressed: 0, #pkts decompressed: 0
      #pkts not compressed: 4, #pkts comp failed: 0, #pkts decomp failed: 0
      #pre-frag successes: 0, #pre-frag failures: 0, #fragments created: 0
      #PMTUs sent: 0, #PMTUs rcvd: 0, #decapsulated frgs needing reassembly: 0
      #send errors: 0, #recv errors: 0

      local crypto endpt.: 192.0.2.6/500, remote crypto endpt.: 192.0.2.129/500
      path mtu 1500, ipsec overhead 74, media mtu 1500
      current outbound spi: C7F1AEC5
      current inbound spi : 9DE630E8

    inbound esp sas:
      spi: 0x9DE630E8 (2649108712)
         transform: esp-aes-256 esp-sha-hmac no compression 
         in use settings ={L2L, Tunnel, }
         slot: 0, conn_id: 45056, crypto-map: outside_map
         sa timing: remaining key lifetime (kB/sec): (4055039/28776)
         IV size: 16 bytes
         replay detection support: Y
         Anti replay bitmap: 
          0x00000000 0x0000001F
    outbound esp sas:
      spi: 0xC7F1AEC5 (3354504901)
         transform: esp-aes-256 esp-sha-hmac no compression 
         in use settings ={L2L, Tunnel, }
         slot: 0, conn_id: 45056, crypto-map: outside_map
         sa timing: remaining key lifetime (kB/sec): (4193279/28776)
         IV size: 16 bytes
         replay detection support: Y
         Anti replay bitmap: 
          0x00000000 0x00000001

A#

Here is how the B-END sees things:

B# show crypto ipsec sa                  
interface: outside
    Crypto map tag: outside_map, seq num: 10, local addr: 192.168.1.1

      access-list OUTSIDE_CRYPTOMAP_10 extended permit ip 10.1.0.0 255.255.255.0 any 
      local ident (addr/mask/prot/port): (10.1.0.0/255.255.255.0/0/0)
      remote ident (addr/mask/prot/port): (0.0.0.0/0.0.0.0/0/0)
      current_peer: 192.0.2.6

      #pkts encaps: 4, #pkts encrypt: 4, #pkts digest: 4
      #pkts decaps: 4, #pkts decrypt: 4, #pkts verify: 4
      #pkts compressed: 0, #pkts decompressed: 0
      #pkts not compressed: 4, #pkts comp failed: 0, #pkts decomp failed: 0
      #pre-frag successes: 0, #pre-frag failures: 0, #fragments created: 0
      #PMTUs sent: 0, #PMTUs rcvd: 0, #decapsulated frgs needing reassembly: 0
      #send errors: 0, #recv errors: 0

      local crypto endpt.: 192.168.1.1/500, remote crypto endpt.: 192.0.2.6/500
      path mtu 1500, ipsec overhead 74, media mtu 1500
      current outbound spi: 8E827434
      current inbound spi : 8471E0F8

    inbound esp sas:
      spi: 0x8471E0F8 (2222055672)
         transform: esp-aes-256 esp-sha-hmac no compression 
         in use settings ={L2L, Tunnel, }
         slot: 0, conn_id: 4096, crypto-map: outside_map
         sa timing: remaining key lifetime (kB/sec): (4147198/27959)
         IV size: 16 bytes
         replay detection support: Y
         Anti replay bitmap: 
          0x00000000 0x0007FFFF
    outbound esp sas:
      spi: 0x8E827434 (2390914100)
         transform: esp-aes-256 esp-sha-hmac no compression 
         in use settings ={L2L, Tunnel, }
         slot: 0, conn_id: 4096, crypto-map: outside_map
         sa timing: remaining key lifetime (kB/sec): (4285438/27959)
         IV size: 16 bytes
         replay detection support: Y
         Anti replay bitmap: 
          0x00000000 0x00000001

B#

You can also check out the IKEv2 SAs like this:

A# show crypto ikev2 sa

IKEv2 SAs:

Session-id:9, Status:UP-ACTIVE, IKE count:1, CHILD count:1

Tunnel-id                 Local                Remote     Status         Role
 89722291         192.0.2.6/500       192.0.2.129/500      READY    RESPONDER
      Encr: AES-CBC, keysize: 256, Hash: SHA96, DH Grp:2, Auth sign: PSK, Auth verify: PSK 
      Life/Active Time: 86400/3606 sec
Child sa: local selector  0.0.0.0/0 - 255.255.255.255/65535
          remote selector 10.1.0.0/0 - 10.1.0.255/65535
          ESP spi in/out: 0xa8d47b04/0xfddbc217

 

B# show crypto ikev2 sa

IKEv2 SAs:

Session-id:9, Status:UP-ACTIVE, IKE count:1, CHILD count:1

Tunnel-id                 Local                Remote     Status         Role
 77759211       192.168.1.1/500         192.0.2.6/500      READY    INITIATOR
      Encr: AES-CBC, keysize: 256, Hash: SHA96, DH Grp:2, Auth sign: PSK, Auth verify: PSK 
      Life/Active Time: 86400/3526 sec
Child sa: local selector  10.1.0.0/0 - 10.1.0.255/65535
          remote selector 0.0.0.0/0 - 255.255.255.255/65535
          ESP spi in/out: 0xfddbc217/0xa8d47b04
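
The child SA selectors in this output are printed as address ranges rather than prefixes. As a quick sanity check that they match the crypto ACL, the ranges can be collapsed back into prefixes with Python’s standard ipaddress module. I am reading the /0 and /65535 suffixes as the port range and ignoring them here, which is my interpretation of the output; the function name is my own.

```python
import ipaddress

def selector_to_networks(first, last):
    """Collapse a first-last address range into the prefixes that cover it."""
    return list(ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)))

# B-END local selector: should be the client subnet from the crypto ACL.
print(selector_to_networks("10.1.0.0", "10.1.0.255"))
# -> [IPv4Network('10.1.0.0/24')]

# B-END remote selector: 'any'.
print(selector_to_networks("0.0.0.0", "255.255.255.255"))
# -> [IPv4Network('0.0.0.0/0')]
```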

NAT-T

By default, an ASA will encapsulate both IKEv2 negotiation and the IPSec encrypted packets in UDP 500. If you want to use NAT-T and encapsulate the IPSec packets in UDP 4500 then port forward UDP 4500 on the NAT router and enable NAT-T on each ASA:

NATRouter(config)# ip nat inside source static udp 192.168.1.1 4500 interface FastEthernet0/0 4500
ASA(config)# crypto isakmp nat-traversal

ASA 8.4 on Mac OSX 10.8

Like many before me I wanted to emulate an ASA in my GNS3 environment. I am a Mac user and found this to be tricky, so I’ll post the steps I took to get it working here. Having done this, I was able to add a couple of ASAs to a topology and fire them up. I should add that they take a while to boot up! You’ll also need to add licences to the ASA, although that isn’t OSX specific.

QEMU

Unlike other versions, the OSX GNS3 package does not come with QEMU bundled. Apparently this will change in the next release but for now, we need to download and install the OSX build. This is pretty easy as the package comes with an install script, but I found I did need to fix the file permissions in /usr/local/bin.

First, download the QEMU built for OSX. Then unpack the tarball (tar zxvf QEMU-0.11.0-GNS3-OSX.tar). Finally, run the script:

./Qinstall
Making Directrories - if directories exist you may see errors, which can safely be ignored.
Please supply elevated credentials.
mkdir: /usr/local/: File exists
mkdir: /usr/local/bin/: File exists
Making /usr/local/bin directory...
Making /usr/local/share directory...
Copying files to their proper locations...
All done. Have fun with your JunOS patched version of QEMU!

I already had /usr/local/bin as it was created when I installed Wireshark. I found that the perms were 600 on /usr/local/bin and needed to adjust these so that my user could run the binaries in it:

sudo chmod 755 /usr/local/bin
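
If the octal modes are unfamiliar, this short Python sketch prints them in the form ls -l would show for a directory. On a directory the x bit grants permission to traverse it, so with mode 600 nothing inside can be reached or executed. The helper name is my own.

```python
import stat

def describe_dir_mode(mode):
    """Render permission bits for a directory the way 'ls -l' displays them."""
    return stat.filemode(stat.S_IFDIR | mode)

print(describe_dir_mode(0o600))  # drw-------  (no x bit: directory cannot be traversed)
print(describe_dir_mode(0o755))  # drwxr-xr-x  (everyone may traverse and read)
```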

The files which the script placed in that directory were:

qemu
qemu-img
qemu-system-i386

We point our paths to the bottom two as shown:

QEMU Settings

Splitting the ASA binary

In order for QEMU to be able to boot the ASA software, we need to break it into two files:

asa842-vmlinuz
asa842-initrd.gz

Fortunately this is made infinitely simpler with the repack script, available here. I downloaded asa842-k8.bin from the Cisco website. I did try some newer releases but found the script doesn’t accept them. For now I’m happy with 8.4. As I have access to a Linux box, I ran the script on there. Version 4 of the script has a few dependencies (mkisofs/syslinux/cdrtools) as it produces an ISO among other things (which I didn’t need personally). I installed them anyway just to be sure the script would run cleanly.

[how@fantastic ~]$ ./repack.v4.sh asa845-smp-k8.bin
Repack script version: 4
which: no mkisofs in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/how/bin)
no syslinux/cdrtools - ISO creation skipped
Version is not supported!

So, I su’d up (thanks Barney) and ran yum install mkisofs. I also switched to asa842-k8.bin:

[root@fantastic ~]# ./repack.v4.sh asa842-k8.bin
Repack script version: 4
no syslinux/cdrtools - ISO creation skipped

Okay, one yum install syslinux later..

[root@fantastic ~]# ./repack.v4.sh asa842-k8.bin
Repack script version: 4
Detected syslinux/cdrtools - ISO will be created

This created the following files:

asa842-vmlinuz
asa842-initrd.gz
asa842-initrd-original
asa.iso

We are interested in the first two and configure the ASA Specific Settings in GNS3 as follows:

ASA GNS3 Settings

Final GNS3 configuration

I read that the Kernel settings are sometimes distributed with the image but I couldn’t find them. I got these from a number of sources. If anyone can tell me how to calculate them I’ll be most grateful. The screenshot above shows the config, but here it is for you to cut and paste.

Qemu Options:

-vnc none -vga none -m 1024 -icount auto -hdachs 980,16,32

Kernel:

-append ide_generic.probe_mask=0x01 ide_core.chs=0.0:980,16,32 auto nousb console=ttyS0,9600 bigphysarea=65536

This all done, I tested the settings. Don’t be concerned by the pemu error, it wasn’t ported to OSX apparently and is not needed for ASA emulation anyway.

test_settings

Links

Here are some of the resources I used to get this working:

http://www.brainbump.net/GNS3-How-to-emulate-ASA-8.4-2-under-QEMU
http://blog.ciscoinferno.net/gns3-and-cisco-asa-8-4-part-1
http://www.network-blog.com/ittech/post/2012/02/06/Configure-Cisco-ASA-firewall-version-84-on-GNS3.aspx

Site-to-site IPSec VPN

Introduction

In this post I will walk through the configuration of a site-to-site IPSec VPN tunnel using a pair of ASAs. I’ll use the terms eastbound and westbound to describe traffic flowing across the tunnel, relative to the diagram below.

Network Diagram

There is an error on this diagram: the tunnel (in blue) on the left should read 192.0.2.6 -> 192.0.2.129. I’ll fix this when I get the chance.

IPSecVPN

 

Tunnel Logic

You may think of the tunnel as a logical version of a dedicated point-to-point serial connection between the two ASAs. Since our logical point-to-point link is traversing the Internet we use IPSec encryption to prevent snooping. Each end of the tunnel is on a different subnet (obviously).

Routing

A-END (HOME BASE)

Here we only have transit networks and we use static routes, which scale well enough for this simple point-to-point link.

  • For westbound traffic, we have a default route to send all decapsulated tunnelled traffic received on the ASA out via the orange linknet to R1.
  • For eastbound traffic, R1 has a static route for 10.1.0.0/24 (the B-End client subnet) pointing east to the ASA. The ASA will encapsulate traffic with this destination into the IPSec tunnel.
  • Finally there is an eastbound default route for non-tunnelled traffic to reach any IPSec peers, remote management of the ASA and any other services.

B-End (Remote Site)

There is a default route on the B-End ASA sending everything via its westbound interface (outside). An ACL ensures everything from the local subnet (10.1.0.0/24) is encapsulated in the tunnel. Eastbound return traffic will be de-encapsulated and then routed internally by the ASA so no ACL is needed.

IPSec

A-END Config

! Phase 2 - ipsec tunnel for the data
crypto ipsec ikev2 ipsec-proposal MY_PROPOSAL
 protocol esp encryption aes-256
 protocol esp integrity sha-1
! Phase 1 - isakmp tunnel to encrypt initial ASA chatter
crypto ikev2 policy 1
 encryption aes-256
 integrity sha
 group 5
 prf sha
 lifetime seconds 86400
! light up crypto on the outside interface
crypto ikev2 enable outside
! Define the B-END of the tunnel and configure PSK
tunnel-group 192.0.2.129 type ipsec-l2l
tunnel-group 192.0.2.129 ipsec-attributes
 ikev2 remote-authentication pre-shared-key B_END_KEY
 ikev2 local-authentication pre-shared-key A_END_KEY
! What traffic do we wish to send down the ipsec tunnel?
access-list OUTSIDE_CRYPTOMAP_10 remark ACL to encrypt traffic from anywhere to B-END
access-list OUTSIDE_CRYPTOMAP_10 extended permit ip any 10.1.0.0 255.255.255.0

! Bring it all together and enable on the outside interface
crypto map outside_map 10 match address OUTSIDE_CRYPTOMAP_10
crypto map outside_map 10 set peer 192.0.2.129
crypto map outside_map 10 set ikev2 ipsec-proposal MY_PROPOSAL
crypto map outside_map interface outside

! Send tunneled traffic to the inside interface to be routed on the enterprise:
route inside 0.0.0.0 0.0.0.0 192.0.2.1 tunneled

B-END Config

!
crypto ipsec ikev2 ipsec-proposal MY_PROPOSAL
 protocol esp encryption aes-256
 protocol esp integrity sha-1
!
crypto ikev2 policy 1
 encryption aes-256
 integrity sha
 group 5
 prf sha
 lifetime seconds 86400
crypto ikev2 enable outside
!
tunnel-group 192.0.2.6 type ipsec-l2l
tunnel-group 192.0.2.6 ipsec-attributes
 ikev2 remote-authentication pre-shared-key A_END_KEY
 ikev2 local-authentication pre-shared-key B_END_KEY
!
object-group network clients
 network-object 10.1.0.0 255.255.255.0
access-list clients-out extended permit ip object-group clients any 
access-list clients-out extended permit icmp any any 
access-list OUTSIDE_CRYPTOMAP_10 remark ACL to encrypt traffic from local net to anywhere
access-list OUTSIDE_CRYPTOMAP_10 extended permit ip 10.1.0.0 255.255.255.0 any 
!
access-group clients-out in interface inside
!
crypto map outside_map 10 match address OUTSIDE_CRYPTOMAP_10
crypto map outside_map 10 set peer 192.0.2.6
crypto map outside_map 10 set ikev2 ipsec-proposal MY_PROPOSAL
crypto map outside_map interface outside
!

Interfaces

A-END Config

interface GigabitEthernet0/0
 nameif inside
 security-level 100
 ip address 192.0.2.2 255.255.255.252 
!
interface GigabitEthernet1/0
 nameif outside
 security-level 0
 ip address 192.0.2.6 255.255.255.252

B-END Config

interface GigabitEthernet0/0
 nameif inside
 security-level 100
 ip address 10.1.0.254 255.255.255.0 
!
interface GigabitEthernet1/0
 nameif outside
 security-level 0
 ip address 192.0.2.129 255.255.255.252

Logging

Since the B-End is remote, it would be preferable to log over TCP as it would give more certainty as to the source of the packets. However, this can overload the ASA so we are stuck with UDP. We log more information at the A-End as the traffic doesn’t get encrypted, so it is less of a burden.

A-END

!
logging timestamp
logging trap notifications
logging host outside <LOGGING_HOST>
!

B-END

You can enable buffered logging as needed.

!
logging enable
logging timestamp
logging trap warnings
logging host outside <LOGGING_HOST>
!

Routing Config

For simplicity this example uses static routes. R1 has a static route to send the client network via the A-End ASA:

ip route 10.1.0.0 255.255.255.0 192.0.2.2

The A-END ASA has a default route eastbound, so that any IPSec peer can be configured:

route outside 0.0.0.0 0.0.0.0 192.0.2.5 1

The A-END ASA also needs to be able to route traffic when it pops out of the IPSec tunnel, with any destination address:

route inside 0.0.0.0 0.0.0.0 192.0.2.1 tunneled

The B-End ASA has a static route to send everything (non-tunnel) via its outside linknet. It doesn’t need a tunneled route as the only possible destination is the client LAN 10.1.0.0/24.

route outside 0.0.0.0 0.0.0.0 192.0.2.130 1

Through NAT?

If you want to read about setting up an IPSec VPN through NAT, see this follow up post.

Auto QoS and ASIC / Port mappings on a 6500

Today I came across an interesting quirk which I felt worth sharing. If you enable auto-qos on a 6500 port, the QoS features will be applied to every port which shares the relevant ASIC. Exactly how this plays out will depend on the architecture of the switch, so we need to take a closer look. In this excellent post James Ventre says:

“In the 6500 platform, [the ASIC port mapping is] easily displayed with ‘show interface capabilities’… The portion we’re interested in is labeled ‘Ports-in-ASIC’.”

In my case, I have a WS-X6748-SFP in slot 3 of my 6500 and I’m interested in port 1:

my-6500# show interfaces gig 3/1 capabilities | i ASIC
 Ports-in-ASIC (Sub-port ASIC) : 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47 (1,3,5,7,9,11,13,15,17,19,21,23)

According to this post, there are three levels of ASIC on a 6748 line card:

  1. Two JANUS Bus ASICs
  2. Two SSA Fabric ASICs
  3. Four ROHINI Port ASICs

The section in parentheses refers to the ASIC we are interested in. The backplane on a sup-720 offers 40G per line card. My 6748 has 48 ports, managed by four ROHINI or Sub-port ASICs, each connected at 10Gbps (as an aside, this gives an oversubscription ratio of 1.2:1 as each group of 12 Gigabit ports shares a 10G connection to the backplane).
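
For anyone wanting to check the oversubscription figure, or rerun it for a different card, the arithmetic is sketched below in Python. The function name is my own; the port count, ASIC count and 10G uplink per ASIC are the figures quoted for the 6748.

```python
def oversubscription(ports, port_gbps, uplink_gbps):
    """Ratio of total front-panel bandwidth to the uplink those ports share."""
    return (ports * port_gbps) / uplink_gbps

ports_per_asic = 48 // 4            # 48 ports spread over four port ASICs
ratio = oversubscription(ports_per_asic, 1, 10)
print(f"{ports_per_asic} ports per ASIC, {ratio}:1 oversubscribed")
# -> 12 ports per ASIC, 1.2:1 oversubscribed
```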

When I enabled auto-qos on interface Gig 3/1, I saw the following on the remaining (default config) ports sharing its ROHINI ASIC:

interface GigabitEthernet3/n ! where n is odd and < 24
 no ip address
 shutdown
 wrr-queue cos-map 2 1 1 2
 wrr-queue cos-map 3 5 3 4
 wrr-queue cos-map 3 7 6 7
 rcv-queue cos-map 1 2 1
 rcv-queue cos-map 1 3 2
 rcv-queue cos-map 1 4 3
 rcv-queue cos-map 1 5 4
 rcv-queue cos-map 1 6 5
 rcv-queue cos-map 1 7 6
 rcv-queue cos-map 1 8 7

So there you have it – no need to panic if a load of config appears on your box from a single auto-qos command. However, do be aware of your switch architecture before trying to squeeze everything you can out of an old line card.

Troubleshooting high CPU on a Catalyst 6500

Thanks to Nicola Pezzi for this tip.

If the CPU is pegged on a 6500 due to IP packets being punted up to it [1], there is a nice trick to see what packets are doing the damage in the form of the debug netdr command. To prevent us from overwhelming our box, we can dump 4096 packets (configurable) from the debug to a buffer like this:

6500# debug netdr capture

Here are the options:

6500#debug netdr capture ?
  and-filter              (3) Apply filters in an and function: all must match
  continuous              (1) Capture packets continuously: cyclic overwrite
  destination-ip-address  Capture all packets matching ip dst address
  dmac                    Capture packets matching destination mac
  dstindex                (7) Capture all packets matching destination index
  ethertype               (8) Capture all packets matching ethertype
  interface               (4) Capture packets related to this interface
  or-filter               (3) Apply filters in an or function: only one must match
  rx                      (2) Capture incoming packets only
  smac                    Capture packets matching source mac
  source-ip-address       (9) Capture all packets matching ip src address
  srcindex                (6) Capture all packets matching source index
  tx                      (2) Capture outgoing packets only
  vlan                    (5) Capture packets matching this vlan number

Let’s grab packets transmitted from VLAN 20 which hit the CPU:

 debug netdr capture tx vlan 20

The buffer can be seen with:

show netdr captured-packets

Here is an example capture:

A total of 4096 packets have been captured
The capture buffer wrapped 0 times
Total capture capacity: 4096 packets

------- dump of incoming inband packet -------

interface Vl20, routine mistral_process_rx_packet_inlin, timestamp 15:38:33
dbus info: src_vlan 0x14(20), src_indx 0x342(834), len 0x47E(1150)
bpdu 0, index_dir 1, flood 0, dont_lrn 0, dest_indx 0x380(896)
08020000 00143800 03460004 7E000000 00F70438 10000008 00000000 03804A64
mistral hdr: req_token 0x0(0), src_index 0x342(834), rx_offset 0x76(118)
requeue 0, obl_pkt 0, vlan 0x14(20)
destmac 00.00.0C.07.AC.00, srcmac 10.11.14.16.15.10, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 1132, identifier 7234
df 0, mf 1, fo 0, ttl 254, src 192.0.2.1, dst 172.16.0.31, proto 103

You can customise the capture as you wish. In this example host 192.0.2.1 is sending PIM (IP Protocol 103) traffic to router 172.16.0.31, which is processing the traffic in software. In this case CGMP had been enabled on a 3500 switch, causing us to fall foul of the issues documented here.

This is a simple, safe and powerful tool for troubleshooting when a chassis starts doing things in software.
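With 4096 packets in the buffer, reading the dump by eye gets tedious. As a rough sketch (not part of the original tip), the output of show netdr captured-packets could be pasted into a small Python script that tallies the flows. The function name top_talkers and the sample text are made up for illustration, and the regular expression assumes the IP summary line looks like the one in the capture above.

```python
import re
from collections import Counter

# Matches the IP summary line of each packet dump, e.g.
# "df 0, mf 1, fo 0, ttl 254, src 192.0.2.1, dst 172.16.0.31, proto 103"
# (assumption: every captured IP packet prints a line in this shape)
FLOW_RE = re.compile(
    r"src (\d+\.\d+\.\d+\.\d+), dst (\d+\.\d+\.\d+\.\d+), proto (\d+)"
)


def top_talkers(capture_text, limit=5):
    """Count (src, dst, proto) tuples seen in pasted netdr output."""
    flows = Counter()
    for match in FLOW_RE.finditer(capture_text):
        src, dst, proto = match.groups()
        flows[(src, dst, int(proto))] += 1
    return flows.most_common(limit)


# Hypothetical sample: two copies of the packet above plus one UDP packet
sample = """\
df 0, mf 1, fo 0, ttl 254, src 192.0.2.1, dst 172.16.0.31, proto 103
df 0, mf 1, fo 0, ttl 254, src 192.0.2.1, dst 172.16.0.31, proto 103
df 0, mf 0, fo 0, ttl 64, src 192.0.2.7, dst 172.16.0.31, proto 17
"""

for flow, count in top_talkers(sample):
    print(count, flow)
```

The flow at the top of the list is the first candidate for whatever is being punted to the CPU.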

[1] You’ll see a high percentage under ‘IP input’ when you run ‘show process cpu sorted’.

ASA source port

I was troubleshooting an issue today where an ASA 5505 had been configured as a replacement for an old PIX, but authentication with an external RADIUS server was failing.

A number of stateful firewalls sit between the outside interface of the ASA in our enterprise network and the public Internet. However the outbound rules permitted the appropriate traffic.

First I set up a capture to verify things from the perspective of the ASA:

! Version 8.X
!
access-list CAPTURE permit ip any host 192.0.2.10
access-list CAPTURE permit ip host 192.0.2.10 any
!
! Version 9.X has separate ACEs for ipv4 and ipv6, if you enter the above you'll get:
! ERROR: Capture doesn't support access-list containing mixed policies
! so, change the ACL to look like this:
!
access-list CAPTURE permit ip any4 host 192.0.2.10
access-list CAPTURE permit ip host 192.0.2.10 any4
!
! Now enable the capture
capture RADIUS access-list CAPTURE interface outside

 

Here is the output. The ASA actually has a public IP.

ASA# show capture RADIUS

54 packets captured

1: 12:16:06.883545 802.1Q vlan#8 P0 172.16.1.1.1025 > 192.0.2.10.1645: udp 158
2: 12:16:07.975794 802.1Q vlan#8 P0 172.16.1.1.1025 > 192.0.2.10.1645: udp 158
3: 12:16:09.075771 802.1Q vlan#8 P0 172.16.1.1.1025 > 192.0.2.10.1645: udp 158
…..

You can also view the capture in a web browser like this: https:///capture/RADIUS. Or you can download the file for Wireshark analysis by going to a URL like this: https:///capture/RADIUS/pcap. Clearly you’ll need to replace ‘RADIUS’ with whatever you called your capture.

Also, if your captures are big, you can use the circular-buffer and buffer keywords to make the buffer eat its own tail and increase its size (up to around 32M).

Finally, use the headers-only keyword if you aren’t interested in the content of the packet.

Okay, so we weren’t getting a response. It was at this point that I checked the external interfaces on our Internet facing routers and found an inbound ACL which did not permit traffic to UDP 1025. One minor, conservative adjustment to the ACLs later and we were in business.

Summary

I’ve come across the ASA source port issue before. Blocking TCP/UDP 1024-5 is a common policy as there were well known trojans which used them. This is something to look out for when migrating to an ASA, or any system which uses those ports as a source. I see that the question has already been asked, but if anyone knows how to change the source port an ASA uses for RADIUS requests, do let me know!
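A quick way to spot this during a migration is to scan the capture for source ports that a blanket policy might drop. The following Python sketch is illustrative only: the function name blocked_sources and the BLOCKED_PORTS set are assumptions, and the regular expression is written against the show capture line format shown earlier.

```python
import re

# Matches "172.16.1.1.1025 > 192.0.2.10.1645: udp" in show capture output:
# source IP, source port, destination IP, destination port, protocol
CAPTURE_RE = re.compile(
    r"(\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+): (\w+)"
)

# Assumption: the upstream policy blocks ports 1024 and 1025
BLOCKED_PORTS = {1024, 1025}


def blocked_sources(capture_text):
    """Return sorted (src_ip, src_port, proto) tuples using a blocked port."""
    hits = set()
    for src_ip, src_port, _dst_ip, _dst_port, proto in CAPTURE_RE.findall(capture_text):
        if int(src_port) in BLOCKED_PORTS:
            hits.add((src_ip, int(src_port), proto))
    return sorted(hits)


capture = """\
1: 12:16:06.883545 802.1Q vlan#8 P0 172.16.1.1.1025 > 192.0.2.10.1645: udp 158
2: 12:16:07.975794 802.1Q vlan#8 P0 172.16.1.1.1025 > 192.0.2.10.1645: udp 158
"""

print(blocked_sources(capture))
```

Any hit means the replies to those requests will be addressed to a port that the inbound ACL may be dropping.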

IPSec Part 1: Glossary

This first post is a brief summary of the concepts, protocols and relationships involved in IPSec. I’ll look at how to make use of it in a future post.

IPSec is an IETF open standard which was designed as part of IPv6 and backported to IPv4. It can be used for both remote access and site-to-site VPNs and operates at L3 of the OSI model.

Glossary

  • IKE – Internet Key Exchange: used to negotiate and establish VPN connections
  • ISAKMP – Internet Security Association & Key Management Protocol: the framework within which IKE operates, used to establish SAs.
    • ISAKMP Phase 1 (IKE Negotiation)
      • Negotiate ISAKMP SA
      • Creates secure two-way comms between VPN peers
      • Uses UDP 500 (sometimes blocked by service providers)
    • ISAKMP Phase 2
      • Protected by ISAKMP SA
      • Negotiate IPSec SA
      • Enables payload traffic between VPN peers to be encrypted
      • ISAKMP header is the only non-encrypted part, hence phase 1.
      • One per subnet, per direction. Summarisation can help!
  • IPSec Protocol Suite
    • Two protocols used to encapsulate tunnel data:
      • AH – Authentication Header
        • L3, IP protocol number 51
        • Protects payload and immutable header fields.
      • ESP – Encapsulating Security Payload
        • L3, IP protocol number 50
        • Provides origin authenticity, integrity and confidentiality
    • SA – Security Association
      • the bundle of algorithms and parameters (such as keys) being used to encrypt and authenticate a particular flow in one direction.
  • Phase 1 Policy components:
  • Encryption Algorithms (to encrypt the traffic)
    • DES – Data Encryption Standard (56 bit)
    • 3DES – Triple DES (168 bit)
    • AES – Advanced Encryption Standard (128 bit)
    • AES192 (192 bit)
    • AES256 (256 bit) (recommended)
  • Hashing Algorithms (for data integrity)
    • SHA – Secure Hash Algorithm (recommended)
    • MD5 – Message Digest Algorithm 5
  • IPSec Peer Authentication
    • PSK – Pre Shared Keys (small – medium enterprise)
    • PKI – Public Key Infrastructure (medium – large enterprise)
  • Phase 1 SA Establishment modes
    • Main – protects identity if PSKs are used (typically site-to-site)
    • Aggressive – protects identity if PKI is used (typically remote-access). Has known vulnerabilities so don’t use this.
  • Phase 2
    • AKA Quick mode
    • Creates one IPSec SA per direction, each with a unique SPI
  • SPI – Security Parameter Index
    • Used with destination IP address to select protection applied to outgoing packet
    • Used by IPSec pass-thru to work around PAT issues
  • Initiator – outbound SA
  • Responder – inbound SA
  • Diffie-Hellman – key exchange protocol enabling peers to derive a shared secret key over an untrusted network without ever transmitting it.
  • IPSec Modes
    • Transport
      • Protects L4
    • Tunnel
      • Cisco default
      • Protects whole packet
      • Increases the packet size as additional IP Header is needed
      • Watch for MTU issues
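To illustrate the “one per subnet, per direction” point under ISAKMP Phase 2, here is a rough Python sketch of how summarisation shrinks the number of IPSec SAs. It assumes a simplified model in which every local/remote subnet pair needs one SA in each direction. The function names and the 10.x subnets are invented for the example.

```python
import ipaddress


def ipsec_sa_count(local_subnets, remote_subnets):
    """Assumed model: one SA per local/remote subnet pair, per direction."""
    return 2 * len(local_subnets) * len(remote_subnets)


def summarise(subnets):
    """Collapse adjacent subnets into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(s) for s in subnets]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Hypothetical addressing plan for two sites
local = ["10.1.0.0/24", "10.1.1.0/24"]
remote = ["10.2.0.0/24", "10.2.1.0/24", "10.2.2.0/24", "10.2.3.0/24"]

print("without summarisation:", ipsec_sa_count(local, remote))
print("with summarisation:   ", ipsec_sa_count(summarise(local), summarise(remote)))
```

Summarising only helps when the subnets at each site are contiguous, which is another argument for a tidy addressing plan.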

A note on NAT

Since the IPSec protocol suite operates at L3, there is no L4 port information for NAT to munge, which leads to AH and ESP packets being dropped. IPSec pass-thru builds L4 information from the SPI. Another option is NAT Traversal (NAT-T), where VPN peers dynamically discover that NAT is happening between them and, if so, encapsulate the traffic in UDP 4500.

 

6500 VSS

In this first post on VSS I’m just dumping my notes from a breakout session on VSS at Networkers back in January, mostly for my own reference.

Summary

  • Makes two switches look like one switch
  • although in theory a VSS domain could contain many switches – only 2 are allowed today
  • Requires a dedicated link between the switches called a VSL (Virtual Switch Link)
  • Note: virtual SWITCHING system, this isn’t a router technology.
  • One time conversion involving changes to rommon (or conf-reg?)
    • The switch will find the VSL config before parsing the startup-config file fully
  • Switches referred to as Switch1 and Switch2, nomenclature fixed at conversion
  • One config
  • Ports are renumbered like when you stack 3750s e.g. Te1/1/1 and Te2/4/4
  • Control Plane -> only one box active (the other supervisor has state STANDBY_HOT)
  • Data Plane -> both boxes active
  • VSS has a considerably longer boot time
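The renumbering matters if you have scripts or monitoring that parse interface names. As a small sketch, assuming the switch/slot/port form from the Te1/1/1 example above, a three-part name could be split like this in Python. The function name split_vss_port is hypothetical.

```python
import re

# Three-part VSS interface name: <type><switch>/<slot>/<port>, e.g. Te2/4/4
VSS_PORT_RE = re.compile(r"^([A-Za-z]+)(\d+)/(\d+)/(\d+)$")


def split_vss_port(name):
    """Split a VSS-style interface name into its numbered parts."""
    match = VSS_PORT_RE.match(name)
    if match is None:
        raise ValueError(f"not a switch/slot/port interface name: {name!r}")
    kind, switch, slot, port = match.groups()
    return {"type": kind, "switch": int(switch), "slot": int(slot), "port": int(port)}


print(split_vss_port("Te1/1/1"))
print(split_vss_port("Te2/4/4"))
```

A two-part name such as Te5/4 from a standalone chassis raises ValueError, which is a handy way to catch config that has not been converted.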

Deployment considerations and best practices

  • Never ever just type reload (you will get a warning). Use redundancy reload peer | self or redundancy force-switchover. If you are on the console you’ll need to connect to the other sup. If you go ahead with the reload then both switches will reboot at the same time – probably something you never want to do in a redundant setup.
    • The console will be disabled on switch 2, but cable it up in case of failover (like a 6500 with dual sup)
  • Never ever use write erase. It will wipe the rommon var which sets VSS at startup. Use erase nvram instead.
  • NSF is off by default – switch this on. It replicates the RIB to the standby chassis and greatly speeds up failover as forwarding to non directly attached routes can continue
router ospf 1
 nsf
  • Etherchannel, CEF forwarding and L3 ECMP (Equal Cost Multipath) have all been modified to always favour local links.
    • In a DC the traffic isn’t very random so we may want a L4 EC hash algorithm
    • Sup720 has 3-bit RBH (result bundle hash), Sup2T has 8-bit so the algorithm can be more even.
  • Use unique domain IDs for each VSS pair.  Unique across entire campus network.
    • some MAC addresses as well as the system-id are derived from this
    • h/w swap outs between domains could break things.
    • avoid issues with sup swaps with mac-address use-virtual. This will require a reboot so build it into the boiler plate config
    • Switch MAC addresses are taken from the active chassis but retained on failover
  • Use out of band mac sync: mac address-table synchronize
  • Always dual attach in and out of the VSS or you create a SPOF
    • VSL is there principally for virtualisation and will only be used for data if there isn’t a local path
  • If you understand dual sup SSO (Stateful SwitchOver) you can think of VSS as this, but with the redundant sup in its own chassis and with the line cards in the second chassis available to the active sup.
    • SSO EOBC (100M Ethernet Out of Band Channel) replaced by VSL
    • To be SSO adjacent (fully standby hot on second sup) requires certain conditions to be true
  • We still need to run STP in the background in case a loop is accidentally introduced
  • Mechanisms exist to prevent split brain
    • LMP (Link Management Protocol), a bit like UDLD for the VSL
    • RRP (Role Resolution Protocol), decide who is active (lowest MAC by default), never force a failover. This is what makes the boot time so slow.
  • VSL
    • the split brain state (active-active) is a disaster – duplicate MAC addresses, router IDs etc.
    • VSL is main defence against this so important for it to be as resilient as possible
    • VSL ports must be 10G
    • use at least one of the 10G ports on the Supervisor card since this boots before the line cards
    • have a minimum of 2 x 10G links (can have up to 8)
    • also use a 10G port on a line card (both 10G ports on the Supervisor share an ASIC)
      • line card 10G ports must be VSL capable (note: the X6704 is not capable)
    • VSL takes control and data traffic between the chassis
      • the bandwidth of the VSL should be at least equal to the uplink bandwidth of each individual switch
    • Don’t change the VSL hashing algorithm in production networks since you will cut off some live flows
    • something about the QoS queues being different on the Sup 10G ports if you also use the Sup 1G ports – check my notes and write something sensible

Sample Config

Conversion

This is a one time process which doesn’t need to be done simultaneously on each switch, but probably should be.

! VSS Domain is globally significant
!
! On switch 1
switch virtual domain 100
  switch 1
  exit
int po 1
 switch virtual link 1
 exit
int ra ten 1/5/4-5
 channel-group 1 mode on
 exit
switch convert mode virtual

! On switch 2
switch virtual domain 100
  switch 2
  exit
int po 2
 switch virtual link 2
 exit
int ra ten 2/5/4-5
 channel-group 2 mode on
 exit
switch convert mode virtual
  • This will reboot the switch and change config to tell the switch it is a VSS
  • The switch will pre-parse the config for the VSL info so chatter can commence – on boot you can see which is ACTIVE or STANDBY

Ponder this: the port channels need different numbers as this will be one logical switch at the end.

VSL

! switch 1
int Po 1
 no switchport
 no ip address
 switch virtual link 1
 mls qos trust cos
 no mls qos channel-consistency

! switch 2
int Po 2
 no switchport
 no ip address
 switch virtual link 2
 mls qos trust cos
 no mls qos channel-consistency

Verification

show switch virtual redundancy
- which switch am I?
- is control plane active?
- fabric (data plane) will be ..
show switch virtual role - active switch always first 

VSL Failure recovery

There are three methods, and we can use more than one.

  1. Enhanced PAgP
  2. VSLP “Fast hello”
  3. IP-BFD (Bidirectional Forwarding Detection) (deprecated feature)

We are interested in the first two. We need to detect the failure, recover from it and then reload the previously active sup.

While in recovery mode, avoid config changes (don’t even type conf t). This marks the config as modified and will require manual intervention to bring the VSS back.

  • 1. Enhanced PAgP
    • been around the longest
    • only on 3750 (12.2(46)SE), 4500, 6500 (with min software release)
    • new TLV field in PAgP message with active switch ID
    • sub-second convergence
    • If the attached switch sees two different switch IDs, it feeds them back up the port channel, which triggers the recovery process
  • 2. VSLP “Fast Hello”
    • Virtual Switch Link Protocol
    • dedicated L2 link between the two switches
    • on all the time
    • sub-second hello
    • can be 100M link, no sync, just there as a heartbeat mechanism

Reboots

To reload only one VSS member use one of these commands:
redundancy reload shelf <shelf-ID>
redundancy force-switchover (switch to standby and reload active)
redundancy reload peer (reload standby)

Software upgrade considerations

  • With VSS, the 6500 can be synced across different s/w releases so you can reboot one at a time
  • a message translation mechanism exists but this is limited to compatible versions
  • You have some time with 50% bandwidth but *no outage*
  • If something is broken by the upgrade and we cannot connect, there is a rollback timer (45 minutes by default)
    • need to run issu acceptversion within that time to stop the timer
    • if there is a problem use issu rejectversion to bring forward
    • no unique features are available until you do issu commitversion
    • this allows you to trial the existing features and make sure nothing broke before upgrading the second sup
  • s/w compatibility matrix on cisco.com
  • 15.X train is the only way to get EFSU

ISSU History lesson

  • ISSU available across platforms
  • It is hitless, except on the 6500
  • The 6500 can do ISSU in standalone (non-VSS) mode, but the line cards have to reload
    • ISSU was renamed EFSU on the 6500 because of the hit
    • same commands are used though
  • pre SXI ‘Fast Software Upgrade’ is all we had, which resulted in an outage
  • 12.2(33)SXI – brought in Enhanced FSU

6500 Transit ACLs

A while ago we found that our FWSMs were no longer up to the job. As it happens we were doing nothing more than stateless packet filtering with them so could replicate their functionality with access-lists. I noticed the hit count on the ACLs was rather low and came across this blog post, which explained what was going on rather well.

In summary:

show ip access-list NAME

will only show packets destined for the router, not those passing through it. To see transit traffic you need this command:

show tcam interface <INT> acl [in | out] ip

This makes sense given that ACLs are implemented in TCAM on the 6500 platform.

Verification

We have an interface which originates a default route into our campus, with an ACL in each direction applied to it. We permit ip any any at the end after filtering out the cruft we don’t want to see. Here is a comparison of the two commands for that ACL.

ROUTER#show ip access-lists TO_CAMPUS | i permit ip any
    230 permit ip any any (4475 matches)

Given that this ACL has around 30,000 users behind it and the counters had been cleared earlier on the day the command was run, I would expect the number to be larger.

Now the tcam command:

ROUTER#show tcam interface vlan 80 acl out ip | i ip any any
    permit       ip any any (4141914 matches)

That is more realistic.
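To put a number on the gap, the two counters can be pulled out of the command output and compared. This is a minimal Python sketch, assuming both commands print the counter in the “(N matches)” form seen above. The function name match_count is made up for the example.

```python
import re

# Both commands print the hit counter as "(<number> matches)"
MATCH_RE = re.compile(r"\((\d+) matches\)")


def match_count(line):
    """Extract the hit counter from an ACL line, or 0 if none is shown."""
    found = MATCH_RE.search(line)
    return int(found.group(1)) if found else 0


# Lines copied from the two outputs above
software = match_count("    230 permit ip any any (4475 matches)")
hardware = match_count("    permit       ip any any (4141914 matches)")

print(f"show ip access-lists: {software}")
print(f"show tcam interface:  {hardware}")
print(f"ratio: {hardware / software:.0f}x")
```

The ratio gives a feel for how little of the traffic ever shows up in the show ip access-lists counter.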