Ciena uses machine learning to heal the scars, horror of network management



“The scars” and “that horrible world” are some of the terms for network management, according to one who’s been in the trenches. 

Kailem Anderson was with Cisco Systems for 12 years prior to joining fiber-optics giant Ciena last year. As vice president of portfolio and engineering for Blue Planet, a software division of Ciena, he is trying to help avoid such pain for those who must keep networks running.

“I managed customer networks, and I spent a lot of time hiring analysts to watch the network, to watch alarms, and to build big strings of rules,” for network monitoring, says Anderson. His breezy Aussie accent gives a certain lightness to what sounds like a rather miserable affair.

At $26 million in revenue in 2018, Blue Planet was a tiny fraction of Ciena’s roughly $200 million in software revenue and $3 billion in total revenue that year. But it grew by a healthy 66%, and it can bring a higher profit margin than Ciena’s optical networking equipment sales. It also offers the company a recurring revenue stream that is highly appreciated by Wall Street. Those economic aspects, plus the fact that it can be strategic in designing customers’ networks, make it an important part of where Ciena is headed as a company.

Figuring out what’s gone wrong in a network involves detective work at several levels of what’s known as the “stack” of protocols, as described by the Open Systems Interconnection, or “OSI,” model. Some information comes from the bottom of the stack, “layer one,” which consists of the physical medium of transmission. That could be, for example, coaxial cabling or fiber-optic links.

At the next layer above that, layer two, raw bits are packaged into bundles, such as Ethernet frames, and there’s all kinds of information to be gleaned about the state of those frames of data as they move through the fibers and cables of the network. The next layer up is layer three, where data is packaged as Internet-addressable packets, again with lots of their own information to be gleaned, such as routing and switching information about where the packets are going.

From there, one can go on up to higher levels, layers four through seven, the domain of applications, and get information about how an individual application is placing its data into those internet packets and whether it is having any trouble doing so.

Take the example where there is a transponder failure on one of two optical links. That leads to a route change in Multiprotocol Label Switching, or MPLS. The network equipment reports congestion along the IP route as a link shoulders the burden of more traffic, and an end user experiences heavy delays using the network. All these are part of the same problem, Anderson explains, but getting from the user experience to the transponder failure can be a mystery.

Traditionally, a systems administrator sees the various items in a disparate fashion, with signals at each of the OSI layers coming from different telemetry systems, such as SNMP monitors, the systems log, a third thing that tracks “flows,” and then information coming from an individual piece of equipment, such as information about a recent configuration change — none of which are coordinated. 

What looks like bad user performance from one angle looks like an MPLS routing issue or an IP bandwidth issue at another level, leading to a serious piece of detective work to find the culprit, the transponder failure. 
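To make the detective work concrete, here is a minimal Python sketch of one way signals from different layers might be tied together. It is an illustration only, not Blue Planet's method: the `Event` record, the 60-second window, and the "lowest layer wins" heuristic are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds, on any shared clock
    layer: int        # OSI layer the signal was reported at (1 = physical)
    source: str       # hypothetical name of the telemetry feed
    message: str


def group_by_time(events, window=60.0):
    """Cluster events that arrive within `window` seconds of the previous one."""
    groups = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if groups and ev.timestamp - groups[-1][-1].timestamp <= window:
            groups[-1].append(ev)
        else:
            groups.append([ev])
    return groups


def likely_root_cause(group):
    """Assumed heuristic: the lowest-layer, earliest event is the best suspect."""
    return min(group, key=lambda e: (e.layer, e.timestamp))


events = [
    Event(130.0, 7, "app-monitor", "end user reports heavy delays"),
    Event(105.0, 3, "snmp", "congestion on IP route"),
    # MPLS is often described as sitting between layers 2 and 3; 2 is used here.
    Event(102.0, 2, "syslog", "MPLS route change"),
    Event(100.0, 1, "optical-alarm", "transponder failure on link A"),
    Event(5000.0, 2, "syslog", "unrelated frame errors"),
]

groups = group_by_time(events)
suspect = likely_root_cause(groups[0])
print(suspect.message)
```

A fixed window and a layer ordering are exactly the kind of hand-built rule Anderson describes as costly to maintain; the sketch shows the shape of the problem rather than a solution to it.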

A ticket gets created, and it ping-pongs between teams, with no one team having visibility into the other side, says Anderson. “Eventually they solve it, they have engineers inspect the matter, but it’s very inefficient.”

Sys admins must try and construct systems of rules as to what every possible combination of factors could mean. “They spend 1,000s of hours building these rules,” says Anderson. “It’s a zero sum game to spend that time to identify all the different scenarios.”

Instead, Blue Planet tools can train the network software using a combination of supervised learning, which relies on labeled examples, and reinforcement learning, in which the computer explores states of affairs and possible next steps.

With that combination, the software can be trained to identify patterns “up and down the stack” that are difficult to piece together with a rules-based system. 
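As a rough illustration of learning from labeled examples rather than hand-written rules, the sketch below classifies a set of observed symptoms by finding the most similar labeled incident. The article does not say which models Blue Planet uses, so this one-nearest-neighbor approach, the symptom names, and the labels are all assumptions; reinforcement learning is not shown.

```python
def jaccard(a, b):
    """Similarity of two symptom sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify(symptoms, labeled_examples):
    """Return (label, score) of the most similar labeled example."""
    best_label, best_score = None, -1.0
    for example_symptoms, label in labeled_examples:
        score = jaccard(symptoms, example_symptoms)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical incidents that an operator has already tagged with a root cause.
labeled_examples = [
    (frozenset({"optical_alarm", "mpls_reroute", "ip_congestion", "user_delay"}),
     "transponder_failure"),
    (frozenset({"config_change", "mpls_reroute", "user_delay"}),
     "bad_config_push"),
    (frozenset({"ip_congestion", "user_delay"}),
     "traffic_surge"),
]

# A new incident in which the congestion signal was never reported.
observed = frozenset({"optical_alarm", "mpls_reroute", "user_delay"})
label, score = classify(observed, labeled_examples)
print(label, score)
```

The point of the example is that the match tolerates a missing signal, where an exact-match rule would need a separate entry for every combination.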

“We want to have the system learn to identify those scenarios, to basically help us get to the root cause much more quickly, and to use that information to close the loop,” he says, and then have a supervisor come into the picture only once that outline has been determined. 

The tools necessary to do this are mostly starting from off-the-shelf machine learning models, says Anderson. “Most of this, yes, we can get from the cloud guys,” he says, referring to the various enterprise-grade machine learning offerings in cloud computing facilities. “We use them all,” though the tools can also be run solely on-prem. “It’s six and one half dozen of the other at the moment, but I think analytics is ultimately a good thing to move into the cloud.”

Open-source tools such as SparkML play a big role in organizing all the telemetry data. 

The technology of machine learning, says Anderson, has matured substantially in recent years to make the investment in labeling network events pay off. 

“Five years ago I was playing with this and with the amount of effort that needed to go into labeling, the risk versus value I was getting was questionable,” he says. “With the hardening of the algorithm, and the maturity of AI, that effort-to-reward ratio has compressed significantly. You only have to do a reasonable amount of tagging now and the outputs are significant.”

Anderson maintains there is another dimension in the shift to machine learning, which is that a more comprehensive sense of the network emerges that may lead to different ways of structuring and maintaining networks.

Traditionally, many sys admins will simply turn off sources of information, says Anderson, which is understandable, because of the information overload, but it means that network administrators are throwing away important clues. 

“That’s the complexity in operating with a million different data sources,” he observes. “The traditional way to manage an operations team is to filter the information, almost turn off the information that is too much.

“At Cisco, if I was running a service provider network, I would get in the vicinity of a million events a day, and I might have an operations team of 40 to 50 people who have to handle all that.”

As a consequence, admins end up only looking for “what they deem fair scenarios,” and “are turning off performance-based scenarios,” information about the relative quality of the network. 

But, says Anderson, “you don’t want to turn off the information, you want to funnel it, and use it to identify what conditions are driving consistent scenarios.”
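One way to picture funneling rather than filtering is to keep every event but roll it up into counts, so that recurring conditions stand out. The sketch below assumes a simple dictionary format for events and an arbitrary threshold of three occurrences; neither comes from Ciena.

```python
from collections import Counter


def funnel(events):
    """Summarize every event by (device, kind) instead of discarding any."""
    return Counter((e["device"], e["kind"]) for e in events)


def recurring(summary, min_count=3):
    """Return (device, kind) pairs seen at least `min_count` times, most frequent first."""
    return [(key, n) for key, n in summary.most_common() if n >= min_count]


# Hypothetical event stream; device and kind names are made up for illustration.
events = (
    [{"device": "router-1", "kind": "latency_high"}] * 4
    + [{"device": "router-2", "kind": "link_flap"}] * 1
    + [{"device": "switch-7", "kind": "packet_loss"}] * 3
)

summary = funnel(events)
print(recurring(summary))
```

Nothing is dropped here: the single `link_flap` event stays in `summary` and can still be inspected, it simply does not rise to the top.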

“Eventually, solutions could be different if they’re trained,” he offers. Data may lead to structuring things differently. “Usually, you have a planned network condition, but then an actual network condition; through learning, you might find the actual is more optimal than planned, and then execute a policy” based on that new insight. 

There are new frontiers to achieve, such as delivering analysis of the data in a “graph database” format, says Anderson. “We are in the operations and network world, and so you want to visualize all this in a network graph concept.” Some customers “want to see it just programmatically propagate to northbound systems that are going to leverage that information, to be able to visualize with a graph database and have APIs to send that northbound information to the BSS layer.”
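The appeal of a graph view can be shown with a small sketch: if the network is stored as nodes and edges, finding everything affected by a failure becomes a graph walk. The topology below is invented for illustration, and a plain Python dictionary stands in for a graph database.

```python
from collections import deque

# Hypothetical topology: each key feeds the nodes listed as its value.
feeds = {
    "transponder-A": ["optical-link-A"],
    "optical-link-A": ["mpls-path-1"],
    "optical-link-B": ["mpls-path-1", "mpls-path-2"],
    "mpls-path-1": ["ip-route-east"],
    "mpls-path-2": ["ip-route-west"],
    "ip-route-east": ["video-service"],
    "ip-route-west": ["billing-portal"],
}


def impacted(graph, failed):
    """Breadth-first walk returning everything downstream of `failed`, in discovery order."""
    seen, order, queue = {failed}, [], deque([failed])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


downstream = impacted(feeds, "transponder-A")
print(downstream)
```

A list like `downstream` is the sort of result that could be passed programmatically to a northbound system, though the article does not describe the actual API.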

The one catch at the moment in all this is that systems administrators are not yet ready to close the loop, so to speak, and let machine learning completely take over and automate both detection and resolution of network issues. 

“This isn’t a tech limit, it’s a cultural aspect,” he says. Machine learning systems are probabilistic, not deterministic. Hence, while they can detect many failure issues, there is a reluctance to automate what could be a false positive scenario. “You only need to screw up .0001% of the time and that’s a big issue.”
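A quick calculation shows why such a small error rate still matters. Pairing it with the "million events a day" figure Anderson cites from his Cisco days is an assumption made here for illustration, as is the idea that every event would trigger an automated action.

```python
from fractions import Fraction

# 0.0001% expressed as an exact fraction (1 in 1,000,000).
error_rate = Fraction("0.0001") / 100

# Assumed volume: the "million events a day" figure quoted earlier in the article.
events_per_day = 1_000_000

expected_errors_per_day = error_rate * events_per_day
print(expected_errors_per_day)
```

Under those assumptions the system would take about one wrong automated action every day, which helps explain the caution.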

“I still think we are a little bit away in terms of closing the loop, I think it’s trust in the technology. It will happen incrementally, where you can close the loop on something non-catastrophic, that doesn’t create a failure scenario, where there is low risk, and then other areas over time.”
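The incremental approach Anderson describes could be sketched as a simple gate: automate only when the model is confident and the action is low risk, and hand everything else to a person. The `Diagnosis` fields, the risk labels, and the 0.99 threshold are hypothetical choices for this example, not details from Ciena.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    root_cause: str
    confidence: float  # model's probability estimate, 0.0 to 1.0
    action: str        # proposed fix
    risk: str          # "low" or "high", assigned by operators (assumed scale)


def decide(d, min_confidence=0.99):
    """Automate only low-risk, high-confidence fixes; escalate everything else."""
    if d.risk == "low" and d.confidence >= min_confidence:
        return "auto_remediate"
    return "escalate_to_operator"


safe = Diagnosis("stale_collector", 0.995, "restart_collector", "low")
risky = Diagnosis("transponder_failure", 0.995, "reroute_traffic", "high")
unsure = Diagnosis("stale_collector", 0.80, "restart_collector", "low")

for d in (safe, risky, unsure):
    print(d.action, "->", decide(d))
```

Widening the set of actions marked low risk, or lowering the threshold, would be how the loop gets closed on "other areas over time."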
