AT&T accelerates data science with RAPIDS, Spark

AT&T’s wireless network connects more than 100 million subscribers from the Aleutian Islands to the Florida Keys, spawning a vast ocean of data.

Abhay Dabholkar leads a research group that acts as a beacon on the lookout for the best tools to navigate it.

“It’s fun playing around with new tools that can make a difference to the day-to-day work of AT&T, and when we give staff the latest and greatest tools, it adds to their job satisfaction,” said Dabholkar, a prominent AI architect who has been with the company for more than a decade.

Recently, the team tested the NVIDIA RAPIDS Accelerator for Apache Spark, software that distributes work across the nodes of a cluster, on GPU-powered servers.

It processed a month’s worth of mobile data – 2.8 trillion rows of information – in just five hours. That’s 3.3 times faster and 60% cheaper than any prior test.
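
The article doesn’t publish AT&T’s cluster settings, but enabling the accelerator is largely a configuration exercise. As a rough sketch, a PySpark session might be set up as below; the jar path, app name and GPU resource amounts are placeholders rather than AT&T’s actual values.

```python
from pyspark.sql import SparkSession

# Minimal sketch: enable the RAPIDS Accelerator for Apache Spark.
# The jar path and resource amounts are placeholders; match them to
# your Spark and plugin versions.
spark = (
    SparkSession.builder
    .appName("rapids-etl-sketch")
    # Load the RAPIDS Accelerator plugin so Spark can plan SQL work onto GPUs.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.jars", "/opt/sparkRapidsPlugin/rapids-4-spark.jar")
    # GPU execution on/off switch; "false" falls back to the CPU plan.
    .config("spark.rapids.sql.enabled", "true")
    # How GPUs are shared among executors and tasks (placeholder amounts).
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "0.25")
    .getOrCreate()
)
```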

An amazing moment

“It was a mind-blowing moment, because on CPU clusters it takes more than 48 hours to process just seven days’ worth of data – in the past, we had the data but couldn’t use it because it took so long to process,” he said.

Specifically, the test compared what’s called ETL, the extract, transform and load process that cleanses data before it can be used to train the AI models that unveil new insights.
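
To make the ETL stage concrete, here’s a minimal PySpark sketch of that pattern, using hypothetical paths and column names. Because the RAPIDS Accelerator works at the query-plan level, DataFrame code like this needs no changes to run on GPUs.

```python
from pyspark.sql import functions as F

# Extract: read raw usage records (hypothetical path and schema).
raw = spark.read.parquet("/data/raw/mobility/2023-03/")

# Transform: cleanse and aggregate before model training.
clean = (
    raw
    .dropna(subset=["subscriber_id", "cell_id"])           # drop incomplete rows
    .filter(F.col("duration_sec") > 0)                     # remove bad records
    .withColumn("event_ts", F.to_timestamp("event_time"))  # normalize timestamps
    .groupBy("subscriber_id", "cell_id")
    .agg(F.sum("duration_sec").alias("total_duration"))    # simple feature
)

# Load: persist the cleaned features for the training stage.
clean.write.mode("overwrite").parquet("/data/features/usage/")
```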

“Now we believe that GPUs can be used for ETL and all sorts of batch-processing workloads that we run in Spark, so we’re exploring other RAPIDS libraries to extend our feature engineering work from ETL to machine learning,” he said.

Today, AT&T runs ETL on CPU servers, then transfers the data to GPU servers for training. Doing it all in one GPU pipeline could save time and money, he added.

Satisfy customers, accelerate network design

The savings could appear in a wide variety of use cases.

For example, users could find out more quickly where they get the best connections, improving customer satisfaction and reducing churn. “We could also decide on settings for our 5G towers and antennas more quickly,” he said.

Identifying the AT&T fiber footprint area for dispatching a support truck can require time-consuming geospatial calculations that RAPIDS and GPUs could speed up, said Chris Vo, a senior member of the team who oversaw the RAPIDS tests.
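
The piece doesn’t describe AT&T’s geospatial pipeline, but a distance filter is a typical example of such a calculation. The sketch below, with hypothetical tables and columns, finds trucks near fiber locations using the standard haversine formula; it’s plain column math of the sort the accelerator can offload to GPUs.

```python
from pyspark.sql import functions as F

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) column pairs, in km."""
    dlat = F.radians(lat2 - lat1)
    dlon = F.radians(lon2 - lon1)
    a = (
        F.sin(dlat / 2) ** 2
        + F.cos(F.radians(lat1)) * F.cos(F.radians(lat2)) * F.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * F.asin(F.sqrt(a))

# Hypothetical inputs: fiber endpoints and truck positions.
fiber = spark.read.parquet("/data/fiber_footprint/")
trucks = spark.read.parquet("/data/truck_locations/")

# Find trucks within 5 km of any fiber point.
nearby = (
    trucks.crossJoin(fiber)
    .withColumn("dist_km", haversine_km(
        F.col("truck_lat"), F.col("truck_lon"),
        F.col("fiber_lat"), F.col("fiber_lon")))
    .filter(F.col("dist_km") <= 5.0)
)
```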

“We’re probably getting 300 to 400 terabytes of fresh data a day, so this technology can have an incredible impact – the reports we generate over two or three weeks could be done in hours,” Dabholkar said.

Three use cases and counting

The researchers are sharing their findings with members of AT&T’s Data Platform team.

“We recommend that if a job is taking too long and you have a lot of data, enable GPUs – with Spark, the same code that runs on CPUs runs on GPUs,” he said.
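
That portability is easy to picture: with the accelerator installed, flipping its documented enable flag is enough to move the same query between CPUs and GPUs. A small sketch, reusing the hypothetical DataFrame from the ETL example above:

```python
# Same DataFrame code, two execution targets; only the flag changes.
# ("clean" is the DataFrame from the ETL sketch above.)
spark.conf.set("spark.rapids.sql.enabled", "false")  # plan onto CPUs
cpu_counts = clean.groupBy("cell_id").count().collect()

spark.conf.set("spark.rapids.sql.enabled", "true")   # plan onto GPUs
gpu_counts = clean.groupBy("cell_id").count().collect()
```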

So far, separate teams have found their own gains in three different use cases; other teams also plan to test their workloads.

Dabholkar is optimistic that business units will carry their test results into production systems.

“We’re a telecommunications company that processes petabytes of data daily across all kinds of datasets, and this can dramatically improve our economics,” he said.

Other users, including the U.S. Internal Revenue Service, are following a similar path. It’s a route many will take, given that Apache Spark is used by more than 13,000 companies, including 80% of the Fortune 500.

Register for GTC for free to hear AT&T’s Chris Vo talk about his work, learn more about data science in these sessions and hear the keynote speech from NVIDIA CEO Jensen Huang.
