Fluke Asks: What's Your Mean-Time-To-Know?
Integrating the separate disciplines of network and application performance monitoring.
Network administrators: How long does it take your staff to determine the root cause of an issue? Your "Mean-Time-To-Know" is a key metric that Fluke uses when referring to their newly minted Visual TruView offering, according to Daryle DeBalski, VP & GM of Visual Network Systems (Danaher Corporation owns Visual and Fluke, among other companies).
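For readers who want to put a number on it, here is a minimal sketch of one way to compute a Mean-Time-To-Know. It assumes the metric is the average time between an issue being reported and its root cause being identified. Fluke does not publish a formula in the material cited here, so the function name `mean_time_to_know` and the shape of the incident records are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_know(incidents):
    """Average the gap between 'reported' and 'root cause known'.

    `incidents` is a list of (reported_at, root_cause_known_at) datetime
    pairs. This record layout is an assumption made for illustration.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual gaps, starting from a zero-length timedelta.
    total = sum((known - reported for reported, known in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: one issue took 2 hours to diagnose, another 4.
sample_incidents = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 11, 0)),
    (datetime(2024, 1, 9, 13, 0), datetime(2024, 1, 9, 17, 0)),
]
print(mean_time_to_know(sample_incidents))
```

Only closed incidents can be averaged this way; an issue whose root cause is still unknown has no end time yet, which is exactly the situation the metric is meant to expose.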
TruView is an appliance that uses key data sets such as stream-to-disk packet storage, application response time, transactional decode, IPFIX (NetFlow), and SNMP to present analytics through a single reporting interface. As TruView processes analytics from these data sets, it time-correlates the results.
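To make "time-correlates" concrete, here is a rough sketch of the general idea: measurements from different sources are grouped into shared time windows so they can be read side by side. This is not TruView's actual method, which the source does not describe. The function `correlate_by_time`, the 60-second window, and the source labels are all assumptions made for illustration.

```python
from collections import defaultdict


def correlate_by_time(events, window_seconds=60):
    """Group measurements from several sources into shared time windows.

    `events` is an iterable of (source, epoch_seconds, value) tuples.
    Returns {window_start: {source: [values, ...]}} ordered by window start.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for source, epoch_seconds, value in events:
        # Round the timestamp down to the start of its window.
        window_start = int(epoch_seconds) // window_seconds * window_seconds
        buckets[window_start][source].append(value)
    return {start: dict(by_source) for start, by_source in sorted(buckets.items())}


# Hypothetical measurements: link utilization from SNMP, byte counts from
# flow records, and application response times in milliseconds.
sample_events = [
    ("snmp_utilization_pct", 5, 12.0),
    ("flow_bytes", 30, 48_000),
    ("app_response_ms", 45, 2300),
    ("snmp_utilization_pct", 65, 11.5),
    ("app_response_ms", 70, 180),
]
for window_start, by_source in correlate_by_time(sample_events).items():
    print(window_start, by_source)
```

In the first window of this made-up data, a slow application response sits next to low link utilization, which is the kind of side-by-side view that points away from the network.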
TruView integrates the separate disciplines of network and application performance monitoring, and features a single, correlated dashboard that makes it easy for engineers to drill down, in a few clicks, to individual flows, packets or transactions, or groups of them, for any user, site or period of time. Automatic identification and configuration of applications and networks takes 15 minutes. The starting price of TruView is $25k.
In 2004, Viola Networks sent me their NetAlly assessment tool to evaluate, and I remember using their tool to prove to people the ineffectiveness of LAN hubs in VoIP configurations. (See: Ready! Getting The IP-PBX To Work: Plan, Then Assess) Viola is one of several acquisitions by Danaher whose technology makes up TruView.
The starting price tag may come as sticker shock to most SMBs. But on a recent engagement, where we spent many hours troubleshooting a project involving consultants, a data center, carriers and IT partners, that $25k starting price quickly started to seem insignificant. In our initial meetings with the concerned parties nearly two years ago, we'd stated that adequate tools were lacking.
Without adequate tools, contention develops between the concerned parties, and I think Fluke's approach of bringing analytics for VoIP, network and application performance together in one collaborative dashboard makes sense. For several years, we told a particular customer, "You have data center issues." We knew the network wasn't the culprit, since utilization was low, but we didn't have the tools to prove it. We used probes that pointed toward the data center but told us nothing about the specific issue, whether it was server resources or the application. Performance was hidden, and the lack of metrics hindered problem resolution. Because the project involved health care, resolution became a priority, particularly once doctors experienced what we had already been reporting for several years on behalf of their staff. The problem turned out to be licensing for concurrent user sessions. How much productivity and tech time was lost?
The other lingering ailment we found was that the VMware solution wasn't properly configured either. Again, the days spent by all parties over several years didn't lead to quick resolutions, let alone any respectable Mean-Time-To-Know. The issues became the status quo, and lousy service became the norm for the staff. Turnover among frustrated personnel also plagued the practice. Patients were backlogged in waiting rooms, unable to see doctors because of problems that prevented the office from using Electronic Medical Records or the backend system used in procedures.
Reflecting on these experiences, everyone had tools, but none of the tools combined their analytics. The analytics on hand didn't point to the root cause. The root cause was often masked because, as Fluke's slide above shows, "It's not the network," "It's not the application," and "It's not the server" were the common responses from the responsible parties to the issues reported.