Open Source AI is Where Big Data Was 10 Years Ago

Pop quiz: Who invented the Hoverboard?

Of course the original concept was first documented in the “Back to the Future” movie trilogy, but as for the wheeled device that popped up all over malls and city streets in 2015, the answer is fairly complicated. In fact, despite being the No. 1 selling toy in that year’s holiday season, it’s difficult to name even a single company that manufactured the product. And there’s a reason why: open-source development.

The toy was in fact invented in Shenzhen, China’s engineering and manufacturing hub, which accounts for nearly 1 million jobs and a disproportionate share of China’s GDP. After work, many of the city’s engineers post their ideas on message boards, sketching out for one another the new concepts they’re working on. They collaborate to improve the product, then return to their respective companies, and each company manufactures it. At the product’s peak, there were more than 1,200 hoverboard suppliers in Shenzhen.

While this level of collaboration can certainly be profitable in the short term, it also presents challenges. A similar scenario faces many businesses looking to get into AI and deep learning. As the industry currently stands, there are two options. The first is a limited set of off-the-shelf products for companies seeking to buy and integrate deep learning models and applications for their business, IBM’s Watson being the most recognizable. The alternative is open-source technologies like Google’s TensorFlow and Facebook’s PyTorch. The closed-source option can be expensive and complicated, while the open-source options offer developers a rich, collaborative online network and tools to flesh out their deep learning models but lack enterprise support. Neither Google nor Facebook is in the business of providing enterprise support for TensorFlow or PyTorch, respectively.

When all those hoverboards started having battery issues and catching on fire, who was there to call? No one. For deep learning, the consequences of a lack of support aren’t quite so dangerous, but for enterprises, the analogy holds. With the exception of Google Cloud customers, any enterprise using TensorFlow has to do what the rest of the product’s users do: post to the message board and hope somebody answers. There is no support number to call and no one to come onsite and work collaboratively with your own data engineers and scientists to solve a problem or build an application. This is not a feasible solution for companies that need to quickly address issues at production scale.

There is a third option, the homegrown approach, where companies hire people with the skills to write the very complex code deep learning requires in-house. But the homegrown option also has three distinct problems. One, there is a limited number of individuals with these skills, and many are snatched up for high six-figure salaries by companies like Google itself. Two, it’s hard to get everyday enterprise developers, who mostly work in Java, trained to work proficiently in new programming languages such as Python. And three, as soon as that person gets great at building deep learning models and AI applications using Python and other contemporary tools and languages, they become very marketable and could move to another company. That brain drain can be very difficult to overcome.

So, how do companies fill the gap between the lack of support for open source and the difficulty of building in-house? The best option is to leverage the power of vendor relationships. This model has proven itself with big data, which has nearly universally adopted the open-source framework Hadoop in some capacity. Companies like Cloudera, Teradata and Hortonworks have worked to engineer new tools on top of the framework, so its users can focus on their market expertise and leave the support and SLAs to those vendors. As a result, big data has gone gangbusters at the corporate level, in spite of the lack of data scientists in the workforce.

For companies seeking to gain a competitive advantage in the field of deep learning, it’s time to take the same approach. By seeking out vendors that can work closely with internal teams to spin up deep learning projects, companies can avoid waiting months on message boards or spending money on very expensive hires to get answers to their enterprise-level AI questions.

(Author):
Atif Kureishy

Atif is the Global VP, Emerging Practices, Artificial Intelligence & Deep Learning at Teradata.

Based in San Diego, Atif specializes in enabling clients across all major industry verticals, through strategic partnerships, to deliver complex analytical solutions built on machine and deep learning. His teams are trusted advisors to the world’s most innovative companies to develop next-generation capabilities for strategic data-driven outcomes in the areas of artificial intelligence, deep learning & data science.

Atif has more than 18 years in strategic and technology consulting, working with senior executive clients. During this time, he has both written extensively and advised organizations on numerous topics, ranging from improving the digital customer experience to multi-national data analytics programs for smarter cities, cyber network defense for critical infrastructure protection, financial crime analytics for tracking illicit funds flow, and the use of smart data to enable analytic-driven value generation for energy & natural resource operational efficiencies.
