Cisco Wide Area Application Services system (WAAS/WAVE/WAE)

The Cisco WAAS system consists of a set of devices that work together to optimize TCP network traffic. There are two types of devices that run WAAS software:

Cisco Wide Area Application Engine (WAE)
Cisco Wide Area Virtualization Engine (WAVE)

WAVE appliances support virtualization: virtual blades host one or more guest operating systems inside the appliance. For example, a WAVE-574 can run Microsoft Windows services such as Print Services, Active Directory Services, DNS, and DHCP. The Cisco WAE-674 Wide Area Application Engine is the only branch-office appliance in the Cisco WAE platform that offers virtual blade capability; other WAE models cannot host an external OS.

Why a WAN optimizer

A WAN optimizer is not just a data compressor. It optimizes data transmission over a medium with lower bandwidth and higher latency than the LAN. Cisco WAAS includes the following key elements:

Data Redundancy Elimination (DRE): a form of data deduplication that uses disk and memory to store data already seen, so that repeated data does not have to cross the WAN again.
Persistent LZ Compression (PLZ): packets are compressed before they are sent over the WAN link.
Transport Flow Optimization (TFO): a series of TCP optimizations that improve performance over a high-latency medium. Examples of these TCP optimizations are:
- large initial windows
- selective acknowledgment (SACK)
- window scaling
- large buffers
- advanced congestion avoidance algorithms
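
To make the DRE and compression ideas above more concrete, here is a minimal Python sketch. It is not Cisco's implementation: the fixed chunk size, the SHA-256 signatures, the use of zlib, and the names DedupPeer and wire_size are all assumptions chosen for illustration. Each side keeps a cache of chunks; a chunk already seen is replaced by its signature, and anything else is compressed before it is sent.

```python
import hashlib
import zlib

CHUNK_SIZE = 1024  # hypothetical fixed chunk size; real DRE chunking differs


class DedupPeer:
    """One side of the WAN link, holding a cache of chunks already seen."""

    def __init__(self):
        self.cache = {}  # signature -> chunk bytes

    def encode(self, data):
        """Replace cached chunks by their signature, compress the others."""
        messages = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            sig = hashlib.sha256(chunk).digest()
            if sig in self.cache:
                messages.append(("ref", sig))
            else:
                self.cache[sig] = chunk
                messages.append(("raw", zlib.compress(chunk)))
        return messages

    def decode(self, messages):
        """Rebuild the original bytes, learning new chunks along the way."""
        out = bytearray()
        for kind, payload in messages:
            if kind == "ref":
                out += self.cache[payload]
            else:
                chunk = zlib.decompress(payload)
                self.cache[hashlib.sha256(chunk).digest()] = chunk
                out += chunk
        return bytes(out)


def wire_size(messages):
    """Bytes that would cross the WAN for these messages (payloads only)."""
    return sum(len(payload) for _, payload in messages)


# Send the same 16 KiB of pseudo-random data twice.
sender, receiver = DedupPeer(), DedupPeer()
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(512))
first = sender.encode(data)
assert receiver.decode(first) == data
second = sender.encode(data)
assert receiver.decode(second) == data
print(len(data), wire_size(first), wire_size(second))
```

On the second pass only 16 signatures of 32 bytes each cross the link, which is why replication and archive traffic that repeats earlier data tends to benefit so much from this kind of cache.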

But how does WAAS behave in real cases?

Cisco WAAS appliances now fall under the Data Center Business Unit, and the reason is simple: most TCP traffic on the WAN is data-center related, and data-center traffic is the most optimizable.

Suppose two data centers replicate to each other over a WAN line. Both data centers run VMware and Oracle on NetApp storage:

Replication: NetApp SnapMirror is uncompressed;
SQL: Oracle archive transport (over TCP port 1521, which is also used for querying the database) is uncompressed;
Other traffic (Web, CIFS, email, other file transfer types, …)

The traffic above is reduced as follows:

Replication traffic (NetApp SnapMirror) is reduced by 83%: the replication traffic crossing the WAN is 17% of the original traffic!
SQL (Oracle archive transport) is reduced by 70%: the SQL traffic crossing the WAN is 30% of the original traffic!
Teradata replication traffic using FTP is reduced by 75%!

During the tests, 2 TB entered the WAE and only 580 GB crossed the WAN line. This means a reduction of roughly 71%!
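
The per-application percentages above translate directly into how much of the original traffic stays on the wire and how much more application data the same link can carry. The following Python sketch shows that arithmetic; the function names are made up for illustration, and the only inputs are the reduction figures quoted in the list.

```python
def wan_share(reduction_pct):
    """Percentage of the original traffic that still crosses the WAN."""
    return 100.0 - reduction_pct


def effective_multiplier(reduction_pct):
    """How many times more application data fits on the same link."""
    return 100.0 / wan_share(reduction_pct)


# Reduction figures quoted in the text.
measurements = [("SnapMirror", 83), ("Oracle archive", 70), ("Teradata FTP", 75)]
for name, pct in measurements:
    print(f"{name}: {wan_share(pct):.0f}% on the wire, "
          f"about {effective_multiplier(pct):.1f}x effective capacity")
```

For example, an 83% reduction leaves 17% on the wire, so the same line carries close to six times as much replication data.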

WAAS Auto discovery

In the following topology, only traffic between Site A and Site C can be optimized. Traffic from and to Site B cannot be optimized because there is no WAE appliance there:

Suppose a connection is initiated from Site A to Site C:

Site A sends a SYN packet to Site C: the WAE in Site A sets TCP option 0x21.
The WAE in Site C receives the SYN packet and sees that this connection can be optimized: TCP option 0x21 is removed and the SYN packet reaches its destination in Site C.
Site C sends a SYN/ACK packet back to Site A: the WAE in Site C shifts the sequence number by 2 billion.
The WAE in Site A receives the SYN/ACK packet and recognizes that this connection can be fully optimized. The original sequence number is restored and the packet reaches its destination in Site A.
This particular TCP connection can now be optimized because there is a WAE at each end of the path.
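
The handshake above can be sketched in a few lines of Python. This is only an illustration of the marking and sequence-shift idea, not the WAAS wire format: the Segment class and the function names are hypothetical, and the shift value simply takes "2 billion" literally, which is an assumption about the exact number.

```python
from dataclasses import dataclass, field

WAAS_OPTION = 0x21          # TCP option used for auto-discovery, per the text
SEQ_SHIFT = 2_000_000_000   # "2 billion" from the text; exact value is an assumption
SEQ_SPACE = 2 ** 32         # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    flags: str
    seq: int
    options: set = field(default_factory=set)


def mark_syn(segment):
    """Near-end WAE: advertise itself by adding the option to the SYN."""
    segment.options.add(WAAS_OPTION)
    return segment


def receive_syn(segment):
    """Far-end WAE: note whether a peer marked the SYN, then strip the option."""
    peer_found = WAAS_OPTION in segment.options
    segment.options.discard(WAAS_OPTION)
    return peer_found


def shift_seq(seq):
    """Far-end WAE: shift the sequence number, wrapping at 2**32."""
    return (seq + SEQ_SHIFT) % SEQ_SPACE


def restore_seq(seq):
    """Near-end WAE: undo the shift before delivering to the host."""
    return (seq - SEQ_SHIFT) % SEQ_SPACE


syn = mark_syn(Segment("SYN", seq=1000))
optimizable = receive_syn(syn)            # peer detected, option removed
server_seq = 4_000_000_000
syn_ack = Segment("SYN/ACK", seq=shift_seq(server_seq))
print(optimizable, syn.options, restore_seq(syn_ack.seq) == server_seq)
```

The modulo keeps the shifted value inside the 32-bit sequence space, so the restore step recovers the original number even when the addition wraps.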

WAAS through Firewall (Cisco ASA)

If packets flow through a firewall that removes unknown TCP options (0x21 is an unknown TCP option), the WAAS auto-discovery mechanism fails and traffic is not optimized. On recent Cisco ASA releases, a WAAS inspection rule must be enabled:

policy-map global_policy
 class inspection_default
  inspect waas

Check Point firewalls should not be a problem; other firewalls should be checked.
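
Why a firewall breaks auto-discovery can be shown with a toy model in Python. It does not describe ASA internals: the firewall_filter function and its allow_waas flag are hypothetical, and the flag only stands in loosely for the effect of permitting option 0x21. The list of well-known option kinds follows the standard TCP option numbers.

```python
# Standard TCP option kinds: EOL, NOP, MSS, window scale,
# SACK-permitted, SACK, timestamps.
KNOWN_TCP_OPTIONS = {0, 1, 2, 3, 4, 5, 8}
WAAS_OPTION = 0x21


def firewall_filter(options, allow_waas=False):
    """Toy firewall: drop every TCP option it does not recognize."""
    allowed = set(KNOWN_TCP_OPTIONS)
    if allow_waas:
        allowed.add(WAAS_OPTION)
    return [opt for opt in options if opt in allowed]


def discovery_succeeds(options_at_remote_wae):
    """The far-end WAE only sees a peer if the option survived the path."""
    return WAAS_OPTION in options_at_remote_wae


syn_options = [2, 4, 3, WAAS_OPTION]   # MSS, SACK-permitted, window scale, WAAS
print(discovery_succeeds(firewall_filter(syn_options)))
print(discovery_succeeds(firewall_filter(syn_options, allow_waas=True)))
```

With the default filter the option is stripped and the connection passes through unoptimized; once the option is allowed, the far-end WAE can detect its peer.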

Inline

Cisco WAE/WAVE appliances can intercept traffic using an inline adapter or the WCCP protocol. The inline adapter is the simplest method: just put the appliance between the WAN and the router, and all traffic flowing from and to the WAN will be optimized.

Mechanical bypass mode prevents the WAE inline adapter from becoming a single point of failure: traffic continues to flow between the router and the client, passing through an unresponsive WAE without being processed. Of course, the traffic is not optimized while the WAE is down. A serial inline cluster is designed for high-availability failover only:

A serial inline cluster is not designed to provide load balancing or a method for scaling traffic beyond the capacity of a single Cisco WAAS device.

My opinion (and experience with other vendors) is that bypass mode sometimes does not work properly, so a better topology could be the following:

In the example above, if the WAE fails and bypass mode does not work, the active router loses all routing from the WAN and can lower its HSRP priority to become the standby router. Of course, the traffic is no longer optimized.

WCCP

To scale a WAAS installation, WCCP must be used. WCCP is a Cisco-developed content-routing protocol that provides a mechanism to redirect traffic flows in real time. It has built-in load balancing, scaling, fault tolerance, and service-assurance (fail-safe) mechanisms.

In the above topology:

the client sends traffic to the WAN, the router (a.k.a. WCCP Server) redirects the traffic to the WAE (a.k.a. WCCP Client);
the WAE optimizes traffic and sends it back to the router which forwards it to the WAN;
the returning optimized traffic is received by the router and redirected to the WAE;
the WAE restores the original traffic (uncompressed) and sends it back to the router which forwards the traffic to the client.

A WCCP service group defines which types of traffic are intercepted and how the intercepted traffic should be handled. Service groups can be well-known (0-50) or dynamic. Well-known services, also referred to as static services, have a fixed set of characteristics that are known by both the WCCP server (IOS) and the client. The characteristics of a dynamic service are communicated to the WCCP server (IOS) when the client joins the service group.

The forwarding method defines how traffic that is being redirected from the router to the WAE is transmitted across the network (1,3). Two options are available:

WCCP GRE: the default forwarding method. It encapsulates the intercepted packet in an IP GRE header with a source IP address of the WCCP server (IOS) and a destination IP address of the target WCCP client.
WCCP L2: only available on hardware-based platforms such as the Catalyst series switches. It simply rewrites the destination MAC address of the intercepted packet to the MAC address of the target WCCP client. L2 forwarding requires that the WCCP server (IOS) is Layer 2 adjacent to the WCCP client.
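
The difference between the two forwarding methods can be sketched with a small Python model. The Packet class, the function names, and all addresses (taken from documentation ranges) are hypothetical; the sketch only shows which header fields each method touches.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    dst_mac: str
    src_ip: str
    dst_ip: str
    inner: Optional["Packet"] = None   # set when this packet is a GRE wrapper


def forward_gre(packet, router_ip, wae_ip, next_hop_mac):
    """WCCP GRE: wrap the untouched packet in an outer header, router -> WAE."""
    return Packet(dst_mac=next_hop_mac, src_ip=router_ip, dst_ip=wae_ip,
                  inner=packet)


def forward_l2(packet, wae_mac):
    """WCCP L2: keep the IP header, rewrite only the destination MAC."""
    return replace(packet, dst_mac=wae_mac)


original = Packet("00:00:5e:00:53:01", "192.0.2.10", "198.51.100.20")
gre = forward_gre(original, "10.0.0.1", "10.0.0.2", "00:00:5e:00:53:02")
l2 = forward_l2(original, "00:00:5e:00:53:02")
print(gre.dst_ip, gre.inner == original)   # outer header targets the WAE
print(l2.dst_ip, l2.dst_mac)               # IP header unchanged, MAC rewritten
```

Because L2 forwarding changes only the destination MAC, the WAE has to sit on the same Layer 2 segment as the redirecting device, whereas the GRE wrapper can be routed.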

The return (egress) method defines how the traffic returns from the WAE to the router. Three options are available:

Generic GRE and WCCP GRE (two GRE-based options): traffic returning from the WAE to the router is encapsulated in a GRE tunnel, with a destination IP address of the WCCP router-id and a source IP address of the WAE itself. When the packet is received by the router, the GRE encapsulation is removed and the packet is forwarded. Generic GRE is intended for hardware-based platforms such as the Catalyst 6k, and WCCP GRE for other platforms.
IP Forwarding: packets from the WAE to the router are sent using the IP forwarding table, typically via the default gateway address.

Let’s choose two WCCP dynamic service numbers: one service will be used from the client to the WAN (61), and the other service will be used for return traffic from the WAN to the client (62).

Given the above topology, service 61 can be redirected to the WAE:

on GigabitEthernet0/0 (ingress packets);
on GigabitEthernet0/1 (egress packets).

Service 62 can be redirected to the WAE:

on GigabitEthernet0/1 (ingress packets);
on GigabitEthernet0/0 (egress packets).

The forwarding method should be WCCP GRE, and the egress method can be either WCCP GRE or IP Forwarding (IP Forwarding is used here).

The router configuration is the following:

ip wccp 61
ip wccp 62
interface GigabitEthernet0/0
 ip wccp 61 redirect in
interface GigabitEthernet0/1
 ip wccp 62 redirect in
interface GigabitEthernet1/0
 ip wccp redirect exclude in

The interface connected to the WAE must be configured with the “exclude in” option; otherwise, packets will loop. An equivalent configuration could be the following:

ip wccp 61
ip wccp 62
interface GigabitEthernet0/1
 ip wccp 61 redirect out
 ip wccp 62 redirect in
interface GigabitEthernet1/0
 ip wccp redirect exclude in

The redirect in option consumes less CPU and should be preferred over the redirect out option, so the first configuration is better than the second one.
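
The two router configurations can be compared with a simplified Python model of the redirect decision. This is a sketch and not how IOS is implemented: the dictionaries and the redirect_service function are hypothetical, and the model assumes that both services match every TCP packet and that "exclude in" only protects packets from outbound redirection.

```python
# First configuration from the text: "redirect in" on both traffic interfaces.
CONFIG_IN = {
    "GigabitEthernet0/0": {"in": 61},
    "GigabitEthernet0/1": {"in": 62},
    "GigabitEthernet1/0": {"exclude_in": True},
}

# Second configuration: both services on the WAN-facing interface.
CONFIG_OUT = {
    "GigabitEthernet0/1": {"out": 61, "in": 62},
    "GigabitEthernet1/0": {"exclude_in": True},
}

# Second configuration without the exclusion, to show the loop.
CONFIG_OUT_NO_EXCLUDE = {
    "GigabitEthernet0/1": {"out": 61, "in": 62},
}


def redirect_service(config, ingress, egress):
    """Return the WCCP service that intercepts the packet, or None."""
    rules_in = config.get(ingress, {})
    if "in" in rules_in:
        return rules_in["in"]                   # checked when the packet arrives
    if rules_in.get("exclude_in"):
        return None                             # came from the WAE: not redirected again
    return config.get(egress, {}).get("out")    # checked after the routing lookup


CLIENT, WAN, WAE = "GigabitEthernet0/0", "GigabitEthernet0/1", "GigabitEthernet1/0"
for name, cfg in [("in", CONFIG_IN), ("out", CONFIG_OUT),
                  ("out, no exclude", CONFIG_OUT_NO_EXCLUDE)]:
    print(name,
          redirect_service(cfg, CLIENT, WAN),   # client -> WAN
          redirect_service(cfg, WAN, CLIENT),   # WAN -> client
          redirect_service(cfg, WAE, WAN))      # WAE -> WAN (already optimized)
```

In this model both configurations intercept client-to-WAN traffic with service 61 and return traffic with service 62. Without the exclusion, a packet coming back from the WAE would match service 61 again on its way out, which is the loop the text warns about.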

If the WAE is located in the same network as the clients, the egress method must be WCCP GRE or packets will loop:

A GRE tunnel must be configured between the router and the WAE, and traffic arriving through it must be excluded from WCCP redirection:

ip wccp 61
ip wccp 62
interface GigabitEthernet0/0
 ip wccp 61 redirect in
interface GigabitEthernet1/0
 ip wccp 62 redirect in
interface Tunnel1
 ip unnumbered GigabitEthernet0/0
 ip wccp redirect exclude in
 tunnel source GigabitEthernet0/0
 tunnel destination <WAE IP>

This is not a general rule, but WCCP is simpler if you keep in mind that:

an interface must be dedicated to the WAE: it can be physical or virtual (tunnel);
the interface to the WAE must be excluded by WCCP redirect;
use “redirect in” on the other interfaces to redirect traffic from and to the WAN.

With IOS routers, the forwarding (redirect) method must be “WCCP GRE” and the egress method can be IP Forwarding or WCCP GRE. WCCP GRE must be used if the WAE is not in a dedicated VLAN: in this case, a GRE tunnel must be configured on the router.

HA and scaling out using WCCP

WCCP allows routers to balance the traffic load across multiple WAE appliances:

When several WAE appliances (WCCP clients) are registered in the same service group, WCCP automatically distributes intercepted traffic across all of them. The mechanism that determines how intercepted traffic is distributed across the WCCP clients in the service group is called the assignment method. In WCCPv2, there are two available assignment methods: hash assignment and mask assignment. Mask should be preferred over hash for better performance, but it is available on Cisco Catalyst switches only. Cisco IOS routers require hash assignment.
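
The two assignment methods can be illustrated with a short Python sketch. It is deliberately simplified: the XOR hash is a toy and not the real WCCP hash function, the mask value 0x3 and the WAE names are hypothetical, and buckets are handed out round-robin, which is an assumption about how they are spread across the group.

```python
import ipaddress

HASH_BUCKETS = 256   # hash assignment distributes a 256-bucket table
MASK = 0x3           # hypothetical mask selecting the two lowest address bits


def assign_buckets(waes, num_buckets):
    """Spread buckets round-robin across the WAEs in the service group."""
    return {b: waes[b % len(waes)] for b in range(num_buckets)}


def hash_bucket(ip):
    """Toy hash: XOR the four address bytes (NOT the real WCCP hash)."""
    value = 0
    for byte in ipaddress.IPv4Address(ip).packed:
        value ^= byte
    return value     # always in 0..255


def mask_bucket(ip, mask=MASK):
    """Mask assignment: keep only the address bits selected by the mask."""
    return int(ipaddress.IPv4Address(ip)) & mask


waes = ["wae1", "wae2"]
hash_table = assign_buckets(waes, HASH_BUCKETS)
mask_table = assign_buckets(waes, MASK + 1)

for ip in ["192.0.2.10", "192.0.2.11"]:
    print(ip, hash_table[hash_bucket(ip)], mask_table[mask_bucket(ip)])
```

In both cases the same address always maps to the same bucket, so a given flow keeps landing on the same WAE. The mask variant is a single bitwise AND, which is the kind of operation a switch can perform in hardware.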

When the Generic GRE or WCCP GRE egress method is used with the Catalyst 6500 platform, a single point-to-multipoint tunnel should be configured so that GRE traffic is processed in hardware.