

Google’s language techniques help O2 Czech Republic reveal network secrets




O2 Czech Republic has demonstrated that Word2vec, a neural-network technique developed to understand human languages, can interpret raw cell-tower data, potentially improving network performance. 

It also hopes to develop the technique to uncover trends in customer geolocation.

The independent network provider, which licenses the O2 brand, is developing Word2vec to overcome the problem of messy, unreliable data resulting from SIM cards connecting to network base transceiver stations, says Jan Romportl, O2 Czech Republic chief data scientist.

“Anybody who talks to me from outside the industry thinks we’ve got great geolocation data about all our customers. When people learn the truth, they get very disappointed,” he tells ZDNet.


The problem is that network base stations were never designed to provide meaningful location data. Their connections to individual devices can appear quite random, and many handovers between cells are not recorded.

A known route, such as a journey by train, appears to jump unpredictably between base stations, according to the recorded data, making it very difficult to pinpoint the location from this source alone. GPS data, meanwhile, is only available to phone operating-system providers and apps with which customers have agreed to share the data.

The O2 Czech Republic data-science team wanted to use records of contact between SIM cards and base stations to segment its customers based on their patterns of movement, but it also wanted to use the data to improve network performance.

Having grappled unsuccessfully with these problems, the team turned to Word2vec, developed by researchers led by Tomáš Mikolov at Google, to find out if it could reveal the locations of those base stations from raw network data without additional tagging or interpretation.

Word2vec is a group of machine-learning models that express words as vectors, typically in 100 or more dimensions, based on analysis of a corpus of data, such as the text from Wikipedia.

The process produces word embeddings, which data scientists can manipulate to create linguistically meaningful abstractions. For example, the vector of ‘Queen’ is almost equal to ‘King + Woman – Man’.
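The ‘King + Woman – Man’ arithmetic can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-made and purely hypothetical; real Word2vec embeddings are learned from a corpus and have far more dimensions. The `analogy` helper is likewise an illustrative name, not part of any library.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# Dimensions loosely stand for (royalty, maleness, femaleness).
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "tower": [0.0, 0.3, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(positive, negative, vectors):
    """Return the word closest to sum(positive) - sum(negative).

    Words used in the query itself are excluded from the candidates.
    """
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    excluded = set(positive) | set(negative)
    candidates = [w for w in vectors if w not in excluded]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy(["king", "woman"], ["man"], toy_vectors)
print(result)  # queen
```

With these toy values the combined vector lands closest to ‘queen’ by cosine similarity, which is the same comparison usually applied to learned embeddings.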

The technique is not normally used outside natural-language processing. But O2 Czech Republic’s data-science team thought it might help interpret the corpus of data it collects from SIM cards connecting to base stations.

“We used absolutely no other information; just plain text of the cell ID tokens,” Romportl says.

The team trained Word2vec on this token stream, producing a 100-dimensional vector for each of the 50,000 cell IDs. The problem was then to reduce the number of dimensions to produce a meaningful interpretation of the data.
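To see why a plain stream of cell ID tokens can carry location information, it helps to look at the kind of training pairs a Word2vec-style model consumes. The sketch below is an assumption-laden illustration, not O2 Czech Republic’s pipeline: the cell names, the per-SIM sequences, the window size and the skip-gram pairing are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical data: each list is the ordered stream of cell IDs
# that one SIM card connected to. The IDs are invented.
sim_sequences = [
    ["cell_A", "cell_B", "cell_C", "cell_D"],
    ["cell_B", "cell_C", "cell_D", "cell_E"],
    ["cell_X", "cell_Y", "cell_Z"],
]


def skipgram_pairs(sequence, window=1):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


# Count how often each pair of cells appears next to each other.
pair_counts = Counter()
for seq in sim_sequences:
    pair_counts.update(skipgram_pairs(seq, window=1))

print(pair_counts[("cell_B", "cell_C")])  # 2
print(pair_counts[("cell_A", "cell_X")])  # 0
```

Cells that follow one another in many SIM streams co-occur often, while cells that never appear in the same stream do not co-occur at all. A model trained on such pairs tends to place frequently co-occurring tokens close together, which is one plausible reason the learned vectors could reflect physical proximity.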

Having read research published in 2018, one data scientist on the team suggested a new algorithm called Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP).

“We had no idea how it worked. We just took the default parameters we needed to reduce 100-dimensional space to a 2D space and just did the scatter plot,” Romportl says.

They were amazed by the results.

“It was the best thing I’ve seen in my data-science career. If you flip from the scatter plot to look at the map of the Czech Republic, you can see the reduction was able to create the longitude and latitude coordinates of each tower,” he says.

“That data was not in the original state. It was just a stream of tokens. The neural network is a universal algorithm for dimensionality reduction. It compressed all invisible patterns into 100D space, all the patterns that relate to the location of the base stations. It was a eureka moment for us.”

O2 Czech Republic already knew the location of its base stations, but the findings, presented at the Teradata Universe EMEA 2019 conference in Madrid, demonstrate that Word2vec can be developed to reveal other hidden characteristics of the network, to help improve its performance and customer experience, he says.

The team is also planning to use a related technique, Doc2Vec, to group customers into segments based on their journey patterns, helping outside partners in marketing and public-sector planning, for example.

Although Word2vec has been used outside language processing, O2 Czech Republic’s approach to geospatial data is probably a first, says James Kobielus, lead analyst for data science at research company Wikibon.

“These methods have been kicking around for a while, but what the O2 people are doing sounds very interesting. It’s not anything I’ve seen done elsewhere and as far as I can tell it is an innovation in the application of Word2vec,” he says.


O2 Czech Republic’s work with Word2vec shows why data scientists should be allowed to experiment, says Torsten Volk, industry analyst at Enterprise Management Associates.

“Data scientists are rare and cost a lot of money to hire. Businesses think they had better produce something that works, so they tend to use established techniques that produce results. But they are generally not exploring and finding new things.”

Organizations hoping to find value in the increasing volumes of data they collect could benefit from a more open-ended approach to data science, exploring new applications of machine-learning techniques, as O2 Czech Republic has done, he says.

Or they could wait for the competition to do it first. 


Uniform Manifold Approximation and Projection (UMAP) scatter plot compared to a map of the Czech Republic.

Image: Jan Romportl/O2 Czech Republic



The Best Features Of The Aston Martin Vulcan



Although the Vulcan was specifically designed not to be road legal, one owner decided that they wanted to stick on some license plates and take it on the highway anyway. Except, it was far from that simple, as the conversion process required making some major changes to the car, and cost several hundred thousand dollars on top of the original purchase price (via Motor1). The street conversion was handled by RML Group but had full support from the Aston Martin factory, and after completion, it became the only road-legal Vulcan in existence.

Among the litany of changes required were the addition of windshield wipers, side mirrors, and a central locking system. Michelin road tires were also fitted, and a new set of headlights had to be installed to meet height requirements for British roads. The bladed tail lights were also covered over for safety, and a few of the sharper surface edges around the cabin were smoothed out. Then, the engine was remapped to meet emissions requirements, the suspension was softened, and a lift system was installed to give the car extra clearance for speed bumps. After all that, plus a few final touches, a license plate was fitted and the car was ready to go. Unfortunately, it seems like the owner’s enthusiasm for taking it on the road quickly evaporated, as checking the car’s plates against the British government database shows that its MOT (the annual national roadworthiness test) certificate expired back in January 2022.



5 Cars Owned By Bob Seger That Prove He Has Great Taste



Pulling into the final spot on the list is a 1969 Shelby Cobra GT350 Fastback. This particular car is unique for a few reasons. First, it was the last “new original” Shelby that Ford would produce. The GT350 and GT500 released in 1970 weren’t actually new or original but re-VIN’d production cars from the previous year. Also, during the summer of ’69, Carroll Shelby ended his association with Ford (via MustangSpecs).

It had one of Ford’s new 351 Windsor V8 engines with a 470 CFM four-barrel Autolite carburetor under the hood that pounded out 290 hp and 385 lb-ft of torque. Its 0-60 mph time was a modest 6.5 seconds, and it did the quarter mile in 14.9 seconds (via MustangSpecs).

According to MustangSpecs, it was typically mated to a 4-speed manual transmission, but Seger’s had a Tremec 6-speed stick instead (via Mecum Auctions). Seger’s Candy Apple Red GT350 had Ford’s upgraded interior package, flaunting a landscape of imitation teak wood covering the dash, steering wheel, door accents, and center console trim (via MustangSpecs).

According to Mecum Auctions, Seger’s was number 42 of 935. When it sold at auction in 2013 for $65,000, the listing noted that it had been displayed at the Henry Ford Museum at the Rock Stars, Cars & Guitars Exhibit.



Here’s What Made Volkswagen’s Air-Cooled Engine So Special



Engines like the Chevy Small Block, Ford 5.0, Chrysler HEMI, and Toyota 2JZ are known for power, torque, and how quickly they can propel a hunk of steel down the drag strip or around the corners of a track. The Volkswagen air-cooled engine is remembered by people who have owned one as reliable, easy to maintain, and as numerous as grains of sand on the beach. VW built tens of millions of these engines, including over 21 million for the Beetle alone (via Autoweek).

It’s difficult to nail down specific aspects of the engine’s early history as sources tend to disagree on years. But the engine can be traced back to very early Volkswagen models designed with help from Ferdinand Porsche and built in the late-1930s to early 1940s in Nazi Germany. Official sources from Volkswagen are reluctant to acknowledge use of the engine or even the existence of the Beetle prior to the end of World War II.
