Your Wi-Fi system uses a lot of access points (APs), and the tips I’ve covered in my previous articles may seem overwhelming. So, let’s review a corporate wireless monitoring system that provides visibility into these problems and look at some tips.
1. Monitor client association failures
A high volume of client association failures indicates an underlying problem. The exact number depends on the total number of clients. In general, association failures should be a small percentage of all association events. We’ve seen networks in which there were over 30,000 association failures in one hour, which we find excessive. Let’s look at a few examples.
A large classroom or auditorium creates a situation in which many clients must register within a short time period. Will everyone get connected within a few minutes of entering the room? Can the authentication and DHCP infrastructure handle the registration volume, particularly when classes change? Will all the clients associate with the AP nearest the door, causing it to be overloaded and the remaining APs to be underutilized?
Another network had client authentication failures in part of the network. We found that the Wi-Fi QoS class handling the authentication packets didn’t have sufficient bandwidth, and packets were being dropped. You’ll need detailed client debug information to diagnose failures like this.
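As a rough illustration of treating failures as a percentage of all association events, the Python sketch below computes a per-AP failure rate and flags APs above a threshold. The record layout, AP names, sample counts, and the 5 percent threshold are all assumptions made for the example. They don’t come from any particular monitoring product, and your own baseline should drive the threshold.

```python
from dataclasses import dataclass


@dataclass
class AssociationStats:
    ap_name: str
    attempts: int   # all association events seen in the interval
    failures: int   # association events that ended in failure


# Hypothetical alert threshold; tune it to your own baseline.
FAILURE_RATE_THRESHOLD = 0.05


def failure_rate(stats: AssociationStats) -> float:
    """Return failures as a fraction of all association events."""
    if stats.attempts == 0:
        return 0.0
    return stats.failures / stats.attempts


def aps_over_threshold(samples, threshold=FAILURE_RATE_THRESHOLD):
    """Return the names of APs whose failure rate exceeds the threshold."""
    return [s.ap_name for s in samples if failure_rate(s) > threshold]


# Invented one-hour sample for two APs in the same auditorium.
samples = [
    AssociationStats("auditorium-door", attempts=1200, failures=300),
    AssociationStats("auditorium-rear", attempts=400, failures=4),
]
print(aps_over_threshold(samples))
```

Reporting a rate rather than a raw count keeps a busy AP from looking unhealthy simply because it handles more clients.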
2. Check the number of clients per AP
A critical figure to monitor is the number of clients per AP. One of our clients had a 200-person auditorium to cover. The initial design looked OK, with eight APs scattered around the room. However, two APs were incorrectly wall-mounted, reducing their coverage. Two more APs were not connecting to their controller (see the AP Health tip below). The remaining four APs had to handle 200 students, each with up to two devices (phone and tablet or laptop). Each active AP had to handle 50-100 clients, depending on the exact number of devices in the room. The maximum number of clients per AP depends on a variety of factors, like bandwidth requirements and number of radios; a common value is around 30 clients per AP. These APs were regularly reporting radio overload (see Bandwidth Requirements tip below).
You can configure a limit on the number of clients per AP, but there’s a story there too. We had a case in which an AP that serviced a hallway had a maximum client count configured. The AP was configured with a limit to guarantee a certain amount of bandwidth per client. However, in this case, the AP also provided connectivity to a busy part of the facility. Anyone who walked down the hallway lost connectivity whenever the AP’s maximum client count had been reached. The solution was to install another AP for the hallway.
Also, check for APs that are never or rarely used. It is quite possible that some APs can be re-deployed.
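The auditorium arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names and the sample AP names are hypothetical, and the 30-client limit is the common value mentioned earlier, not a hard rule.

```python
def clients_per_ap(people: int, devices_per_person: int, active_aps: int) -> float:
    """Average load per AP if devices spread evenly across the active APs."""
    if active_aps <= 0:
        raise ValueError("at least one active AP is required")
    return people * devices_per_person / active_aps


def classify_aps(client_counts: dict, max_clients: int = 30):
    """Split APs into overloaded (above max_clients) and idle (no clients)."""
    overloaded = sorted(ap for ap, n in client_counts.items() if n > max_clients)
    idle = sorted(ap for ap, n in client_counts.items() if n == 0)
    return overloaded, idle


# 200 students on four working APs, with one or two devices each.
low = clients_per_ap(200, 1, 4)
high = clients_per_ap(200, 2, 4)
print(f"{low:.0f} to {high:.0f} clients per active AP")

# Hypothetical snapshot of client counts reported by a controller.
snapshot = {"aud-1": 96, "aud-2": 88, "hall-3": 12, "storage-9": 0}
print(classify_aps(snapshot))
```

A single snapshot is not enough to call an AP unused. Before redeploying anything, look at client counts over a period that covers your busiest days.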
3. Monitor bandwidth requirements
Available bandwidth can also be exceeded. This becomes a problem in schools whenever the professor asks the class to watch streaming HD video on their mobile devices. Monitor the utilization of each AP’s wired network interface and set thresholds to provide alerts. You should also trigger alerts on diagnostic messages from APs, like “Radio load threshold violation” and “Interface threshold violation” (see AP Debug/Syslog data tip below).
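One way to turn wired interface counters into a utilization alert is sketched below. It assumes you already collect byte (octet) counters for each AP uplink at a fixed interval, for example via SNMP. The 80 percent threshold, the function names, and the sample numbers are hypothetical, and counter wrap is not handled.

```python
ALERT_THRESHOLD = 0.80  # hypothetical: alert above 80 percent utilization


def utilization(octets_start: int, octets_end: int,
                interval_s: int, link_bps: int) -> float:
    """Fraction of link capacity used between two counter samples."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits = (octets_end - octets_start) * 8
    return bits / (interval_s * link_bps)


def should_alert(util: float, threshold: float = ALERT_THRESHOLD) -> bool:
    """True when utilization is above the alert threshold."""
    return util > threshold


# Invented sample: 33.75 GB transferred in 5 minutes on a 1 Gb/s uplink.
util = utilization(0, 33_750_000_000, 300, 1_000_000_000)
print(f"{util:.0%}", should_alert(util))
```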
4. Monitor AP Health
APs may not connect to a wireless controller due to simple configuration errors. This failure can be avoided by automating the configuration, followed by a validation step. We’ve seen networks where many APs were not connected, which obviously created coverage holes where those APs were located.
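The validation step can be as simple as comparing the APs you expect to see against the APs the controller reports as joined. The sketch below assumes you can export both lists as AP names; the names and the function are hypothetical.

```python
def validate_ap_joins(expected: set, joined: set):
    """Return (missing, unexpected) AP names as sorted lists."""
    missing = sorted(expected - joined)      # deployed but not joined
    unexpected = sorted(joined - expected)   # joined but not in the inventory
    return missing, unexpected


# Invented inventory and controller export.
expected = {"aud-1", "aud-2", "aud-3", "aud-4"}
joined = {"aud-1", "aud-3", "lab-7"}

missing, unexpected = validate_ap_joins(expected, joined)
print("Not joined:", missing)
print("Not in inventory:", unexpected)
```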
In addition to AP operational health, keep an inventory of the age of all the wireless infrastructure, both software and hardware. Your wireless management system should track both elements and facilitate updating software. End-of-life APs should be included in a hardware refresh plan. Part of this plan may be to roll out new, more capable APs to demanding locations and move the displaced APs to less-demanding locations. However, relocating APs increases labor costs, which can make this approach unattractive.
5. Eliminate or restrict 802.11b/g clients
If you are still supporting 802.11b clients at low speeds, they will consume a large portion of the radio transmit/receive time, which limits the availability for higher-speed clients. Newer Wi-Fi standards (802.11n and later) should be used to take advantage of the 5 GHz bands, where there is more bandwidth and more equitable sharing of radio transmit/receive slots.
Identify the types of clients on the network (802.11a, b, g, n, ac, ax). Work to replace the older, slower clients. If you can’t replace them, you may be able to restrict them to specific locations where they don’t impact the rest of the network.
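A client inventory by 802.11 generation can be summarized with a short script. This sketch assumes your management system can export (MAC address, protocol) pairs; the sample data and the function names are made up for illustration.

```python
from collections import Counter

# Generations this tip suggests eliminating or restricting.
LEGACY_PROTOCOLS = {"802.11b", "802.11g"}


def protocol_summary(clients):
    """Count clients per 802.11 generation."""
    return Counter(protocol for _, protocol in clients)


def legacy_clients(clients):
    """Return the MAC addresses of clients using a legacy protocol."""
    return [mac for mac, protocol in clients if protocol in LEGACY_PROTOCOLS]


# Invented export of (MAC address, protocol) pairs.
clients = [
    ("00:00:5e:00:53:01", "802.11ax"),
    ("00:00:5e:00:53:02", "802.11ac"),
    ("00:00:5e:00:53:03", "802.11g"),
    ("00:00:5e:00:53:04", "802.11b"),
    ("00:00:5e:00:53:05", "802.11ac"),
]
print(protocol_summary(clients))
print(legacy_clients(clients))
```

The list of legacy MAC addresses is the starting point for deciding which devices to replace and which to confine to specific locations.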
6. Take advantage of AP debug/syslog data
APs produce a lot of useful debug/syslog data. You can think of syslog as the network device’s mechanism to report problems. Make sure that syslog is being captured and that problems are identified. Modern log processing tools use machine learning to make it easy to identify important events with minimum effort.
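Even without a log analytics platform, a simple pattern count can confirm that syslog is being captured and surface the two threshold messages mentioned in the bandwidth tip. This is plain string matching, not the machine-learning approach those tools use, and the sample log lines are invented; real message formats vary by vendor.

```python
from collections import Counter

# Message fragments named in the bandwidth tip.
PATTERNS = (
    "Radio load threshold violation",
    "Interface threshold violation",
)


def count_alerts(lines, patterns=PATTERNS):
    """Count how many log lines contain each pattern (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for pattern in patterns:
            if pattern.lower() in lowered:
                counts[pattern] += 1
    return counts


# Hypothetical syslog lines; the layout is an assumption.
sample_lines = [
    "Mar  3 10:15:02 ap-aud-1 Radio load threshold violation on radio 1",
    "Mar  3 10:15:09 ap-aud-2 Interface threshold violation on port 0",
    "Mar  3 10:16:40 ap-aud-1 Radio load threshold violation on radio 0",
    "Mar  3 10:17:00 ap-hall-3 Client associated",
]
for pattern, count in count_alerts(sample_lines).items():
    print(f"{count} x {pattern}")
```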
7. Monitor DHCP pools
Clients must obtain an IP address while joining the wireless network, typically from DHCP. A network design that uses limited DHCP pools for different parts of the network may run out of addresses at busy locations, such as building entrances. The telltale symptom is that affected clients can join the network once they associate with other APs whose pools still have addresses remaining.
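A pool utilization check might look like the sketch below. It assumes you can pull each pool’s address range and active lease count from your DHCP server. The pool names, ranges, lease counts, and the 90 percent warning threshold are hypothetical.

```python
import ipaddress

WARN_THRESHOLD = 0.90  # hypothetical: warn when a pool is 90 percent used


def pool_size(first: str, last: str) -> int:
    """Number of addresses in an inclusive DHCP range."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def pools_near_exhaustion(pools, threshold=WARN_THRESHOLD):
    """Return the names of pools whose lease usage exceeds the threshold."""
    flagged = []
    for name, first, last, active_leases in pools:
        if active_leases / pool_size(first, last) > threshold:
            flagged.append(name)
    return flagged


# Invented pools: (name, first address, last address, active leases).
pools = [
    ("lobby", "10.1.20.10", "10.1.20.250", 236),
    ("floor-3", "10.1.30.10", "10.1.30.250", 90),
]
print(pools_near_exhaustion(pools))
```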
8. Track PoE budgets on switches
An obscure but important monitoring item is the Power over Ethernet (PoE) budget on the switches that connect APs to the network. APs that don’t receive full power will disable some functions, like frequency scanning, or will shut down radios. You’ll want to check for radio interfaces that are marked as administratively up but operationally down.
Long cable runs can deliver low power to APs as well. This situation can arise when an older AP is replaced with a newer AP that requires more power than a long cable run will support.
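Both checks in this tip reduce to simple comparisons, sketched below. The wattage figures, the switch budget, and the radio record layout are assumptions for illustration; use the values your switches and APs actually report.

```python
def poe_headroom_w(budget_w: int, allocated_w: list) -> int:
    """Watts left in the switch PoE budget after current allocations."""
    return budget_w - sum(allocated_w)


def can_power(budget_w: int, allocated_w: list, new_device_w: int) -> bool:
    """True if one more device fits within the remaining PoE budget."""
    return new_device_w <= poe_headroom_w(budget_w, allocated_w)


def radios_admin_up_oper_down(radios):
    """Names of radios that are enabled in config but not operating."""
    return [r["name"] for r in radios if r["admin_up"] and not r["oper_up"]]


# Hypothetical switch: 370 W budget, 20 ports each allocated 15 W.
allocated = [15] * 20
print(poe_headroom_w(370, allocated))   # 70
print(can_power(370, allocated, 30))    # True
print(can_power(370, allocated, 80))    # False

# Hypothetical radio status records from an AP.
radios = [
    {"name": "ap-aud-1/radio0", "admin_up": True, "oper_up": True},
    {"name": "ap-aud-1/radio1", "admin_up": True, "oper_up": False},
]
print(radios_admin_up_oper_down(radios))
```

Running the headroom check before swapping in newer, hungrier APs helps catch a budget shortfall before radios start shutting down.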
9. Monitor packet loss and latency
I always recommend that active path testing be used to measure network health. High packet loss and high latency have a significant impact on network performance. Note: more than 0.0001 percent packet loss is cause for investigation. The latest innovation in network health monitoring is known as Digital Experience Monitoring (DEM), which monitors multiple facets of a network from the perspective of the end-user system and application performance. We’ve successfully used DEM tools to identify problems that were blamed on the corporate network, but which were due to a poorly performing home wireless network.
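To put the 0.0001 percent figure in perspective, it works out to one lost packet per million sent. The sketch below applies that threshold to probe counts from an active path test; the function names and sample counts are hypothetical.

```python
LOSS_THRESHOLD_PERCENT = 0.0001  # the investigation threshold from this tip


def loss_percent(sent: int, received: int) -> float:
    """Packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent * 100


def needs_investigation(sent: int, received: int) -> bool:
    """True when measured loss is above the threshold."""
    return loss_percent(sent, received) > LOSS_THRESHOLD_PERCENT


# Invented results: 10 million probes with 5 lost, then with 20 lost.
print(needs_investigation(10_000_000, 9_999_995))  # False
print(needs_investigation(10_000_000, 9_999_980))  # True
```

A threshold this low only means something over a large sample, so accumulate counts across many test runs before comparing against it.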
Summary
The monitoring tips that I’ve provided come from real-world experiences. Which tips have you already implemented? Are there tips that might be associated with a problem your wireless network is experiencing? With this as a guideline, you should gain better visibility into your wireless network and proactively resolve problems before they impact productivity.