Google’s language techniques help O2 Czech Republic reveal network secrets



O2 Czech Republic has demonstrated that Word2vec, a neural-network technique developed to understand human languages, can interpret raw cell-tower data, potentially improving network performance. 

It also hopes to develop the technique to uncover trends in customer geolocation.

The independent network provider, which licenses the O2 brand, is applying Word2vec to overcome the problem of messy, unreliable data generated when SIM cards connect to network base transceiver stations, says Jan Romportl, O2 Czech Republic chief data scientist.

“Anybody who talks to me from outside the industry thinks we’ve got great geolocation data about all our customers. When people learn the truth, they get very disappointed,” he tells ZDNet.

The problem is that network base stations were never designed to provide meaningful location data. Their connections to individual devices can appear quite random, and many handovers between cells are not recorded.

In the recorded data, a known route, such as a journey by train, appears to jump unpredictably between base stations, making it very difficult to pinpoint a device's location from this source alone. GPS data, meanwhile, is only available to phone operating-system providers and to apps with which customers have agreed to share it.

The O2 Czech Republic data-science team wanted to use records of contact between SIM cards and base stations to segment its customers based on their patterns of movement, but it also wanted to use the data to improve network performance.

Having grappled unsuccessfully with these problems, the team turned to Word2vec, developed by researchers led by Tomáš Mikolov at Google, to find out if it could reveal the locations of those base stations from raw network data without additional tagging or interpretation.

Word2vec is a group of machine-learning models that express words as vectors, typically in 100 or more dimensions, based on analysis of a corpus of data, such as the text from Wikipedia.

The process produces word embeddings, which data scientists can manipulate to create linguistically meaningful abstractions. For example, the vector of ‘Queen’ is almost equal to ‘King + Woman – Man’.
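The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the three-dimensional vectors below are hand-made, hypothetical values chosen so the analogy works, whereas real Word2vec embeddings are learned from a corpus and have 100 or more dimensions. The `analogy` helper is likewise an assumption for this sketch, not part of any library.

```python
import math

# Hypothetical toy embeddings (hand-picked, NOT learned by Word2vec).
# Dimensions can be read loosely as: royalty, male, female.
embeddings = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.0, 0.1, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(positive, negative, vectors):
    """Return the word whose vector is closest to sum(positive) - sum(negative)."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    # Exclude the query words themselves, as is usual for analogy lookups.
    excluded = set(positive) | set(negative)
    candidates = [w for w in vectors if w not in excluded]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# King + Woman - Man
nearest = analogy(["king", "woman"], ["man"], embeddings)
print(nearest)
```

With learned embeddings the match is approximate rather than exact, which is why the lookup returns the nearest vector instead of testing for equality.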

The technique is not normally used outside natural-language processing. But O2 Czech Republic’s data-science team thought it might help interpret the corpus of data it collects from SIM cards connecting to base stations.

“We used absolutely no other information; just plain text of the cell ID tokens,” Romportl says.

The team trained Word2vec on the stream of cell ID tokens, producing a 100-dimensional vector for each of the 50,000 cell IDs. The next problem was to reduce the number of dimensions so the data could be interpreted meaningfully.
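To see why a stream of cell IDs can be treated like text, it may help to look at the (center, context) pairs that a skip-gram style Word2vec model learns from. The sketch below is an assumption-laden illustration, not the team's pipeline: the article does not describe how the data was prepared, and the cell IDs, the per-SIM traces and the `skipgram_pairs` helper are all hypothetical.

```python
from collections import Counter

def skipgram_pairs(sequence, window=2):
    """Yield (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(sequence):
        start = max(0, i - window)
        stop = min(len(sequence), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs

# Hypothetical traces: each list is the ordered sequence of cell IDs
# that one SIM card connected to, playing the role of a "sentence".
sim_traces = [
    ["cell_A", "cell_B", "cell_C", "cell_D"],
    ["cell_B", "cell_C", "cell_D", "cell_E"],
    ["cell_A", "cell_B", "cell_C"],
]

# Count how often two cells appear next to each other across all traces.
pair_counts = Counter()
for trace in sim_traces:
    pair_counts.update(skipgram_pairs(trace, window=1))

for (center, context), count in pair_counts.most_common(4):
    print(center, context, count)
```

Cells that devices visit one after another co-occur often, so a model trained on such pairs pushes their vectors together. In this toy data `cell_B` and `cell_C` are paired three times, while `cell_A` and `cell_E` never are, which is the kind of signal that could let the embedding encode physical proximity.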

Having read research published in 2018, one data scientist on the team suggested a new algorithm called Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP).

“We had no idea how it worked. We just took the default parameters we needed to reduce 100-dimensional space to a 2D space and just did the scatter plot,” Romportl says.

They were amazed by the results.

“It was the best thing I’ve seen in my data-science career. If you flip from the scatter plot to look at the map of the Czech Republic, you can see the reduction was able to create the longitude and latitude coordinates of each tower,” he says.

“That data was not in the original state. It was just a stream of tokens. The neural network is a universal algorithm for dimensionality reduction. It compressed all invisible patterns into 100D space, all the patterns that relate to the location of the base stations. It was a eureka moment for us.”

O2 Czech Republic already knew the location of its base stations, but the findings, presented at the Teradata Universe EMEA 2019 conference in Madrid, demonstrate that Word2vec can be developed to reveal other hidden characteristics of the network and so help improve its performance and customer experience, he says.

The team is also planning to use a related technique, Doc2Vec, to group customers into segments based on their journey patterns, helping outside partners in marketing and public-sector planning, for example.

Although Word2vec has been used outside language processing, O2 Czech Republic’s approach to geospatial data is probably a first, says James Kobielus, lead analyst for data science at research company Wikibon.

“These methods have been kicking around for a while, but what the O2 people are doing sounds very interesting. It’s not anything I’ve seen done elsewhere and as far as I can tell it is an innovation in the application of Word2vec,” he says.

O2 Czech Republic’s work with Word2vec shows why data scientists should be allowed to experiment, says Torsten Volk, industry analyst at Enterprise Management Associates.

“Data scientists are rare and cost a lot of money to hire. Businesses think they had better produce something that works, so they tend to use established techniques that produce results. But they are generally not exploring and finding new things.”

Organizations hoping to find value in the increasing volumes of data they collect could benefit from a more open-ended approach to data science, exploring new applications of machine-learning techniques, as O2 Czech Republic has done, he says.

Or they could wait for the competition to do it first. 

Uniform Manifold Approximation and Projection (UMAP) scatter plot compared to a map of the Czech Republic.

Image: Jan Romportl/O2 Czech Republic

Today’s Wordle Answer #472 – October 4, 2022 Solution And Hints



The answer to today’s Wordle puzzle (#472 – October 4, 2022) is bough, which is what you call a branch, especially the main branch, of a tree. The word bough has roots (no pun intended) in the Old English word “bōg,” which means shoulder, similar to Old High German’s “buog,” which means the same thing (via Etymonline). There’s a popular Roman myth about the Golden Bough, which is a tree branch with golden leaves that enabled the Trojan hero Aeneas to travel safely through the land of the dead.

We solved the puzzle in three tries today, kicking things off with an expert-endorsed starter word, slate. We tried the word brush next, which turned out to be a really lucky guess with three green tiles. The answer was apparent by the third guess, and since we also solved the puzzle in three guesses yesterday, that begins a three-try streak that we hope we can continue tomorrow!


How To Display iPhone 14 Pro’s Dynamic Island On Any Android Device



You can also choose whether to display the cutout at the center of the display (for centered hole-punch cameras) or on the left, for cameras placed in the corner. Remember that as you increase or decrease the cutout size, the icons shown in it will also scale to match. Thankfully, the app gives you a preview of the cutout when you are changing the settings.

You can also modify gestures such as single tap or long press. Dynamic Spot also allows you to change the default time, after which the pop-up automatically disappears. Additionally, you can fiddle with a lot of appearance-related settings, such as the animation when the Dynamic Island clone pops up or unfolds.

Just as on the iPhone 14 Pro, Dynamic Spot on your Android device will show the app icon when a new notification arrives. You can choose which apps display notifications or allow all of them. You can also tap on the app’s icon to open the notification or long-press the icon to preview the notification.


The 10 Wildest Features Of The Mercedes Maybach Off-Roader



Sustainability is a word on every car manufacturer’s radar right now, with more focus being given to the idea of eco-friendly vehicles than ever before. The Off-Roader plays into that theme by featuring a prominent set of solar panels mounted on its hood, which could be used to generate power to extend the range of the car. It’s worth pointing out that this is all hypothetical, as the show car is non-functional, and has no drivetrain. Mercedes is keen to stress, though, that if the car did have a drivetrain, it would be all-electric, although no detail is given on the power or range that would be available to drivers.

The solar panels are interwoven with yet more Maybach logos, and their tinted finish makes them blend in almost seamlessly with the rest of the hood. It’s been pointed out by industry analysts that adding solar panels to cars is not always as environmentally friendly as it might seem, as the panels are only able to generate a very small amount of power. That power can easily be consumed by the added A/C strain caused by parking a car out in the sun all day to charge it. Car-mounted solar panels might be a flawed idea in practice, but even so, it’s interesting to see how Abloh was able to inconspicuously add them in without compromising the overall look of the car.
