The Applied AI Summit is a great place to get the latest insights on deep learning algorithms, robotic process automation, and the use cases and business models powered by cognitive technologies. With exciting AI topics like these, the conversation around storage easily gets pushed to the back burner - but you can't change the world if your infrastructure can't keep up with your AI initiatives!
The dizzying pace of innovation in core AI technologies like GPUs, machine learning platforms, and deep learning frameworks can make this seem like a golden age for data scientists. In many ways it is, but behind the scenes there's often a dismal reality that can be more frustrating than exciting. Data management remains largely manual, fragmented infrastructures keep data stuck in silos, and model training can take weeks to complete, so costs can easily soar out of control. The future remains bright, but getting there from here can be an extremely slow and costly process.
To understand how we got here, consider the relationships among the three main drivers of AI: deep learning neural networks, parallel compute technologies, and the availability of massive amounts of data.
AI algorithms such as deep learning leverage massively parallel neural networks inspired by the human brain. To run them, you need compute resources capable of the same kind of massively parallel processing.
Modern GPUs, with their thousands of cores, provide the massively parallel processing needed to put AI algorithms to work, but they can't do it by themselves. They need to be deployed in server systems designed to run AI frameworks, and they need access to massive amounts of data.
While most of the conversation and excitement centers on AI technology, it's the data itself that really drives value: the more data you have, the more value and insight you can extract. In 2015, 12 zettabytes of data were created worldwide, and that figure is forecast to reach 163 zettabytes by 2025. But if all that data ends up in outdated storage systems designed for an earlier era, the rising volume will only further slow performance. The smarter we get, the slower we'll move, and that's no way to change the world.
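For perspective, those two figures imply roughly a thirteen-fold increase over ten years. A quick back-of-envelope calculation shows the annual growth rate that implies (the per-year rate is an inference from the cited figures, not a reported statistic):

```python
# Back-of-envelope: compound annual growth rate of global data creation
# implied by the 2015 and 2025 figures cited above. The annual rate is an
# extrapolation from those two numbers, not a reported statistic.

data_2015_zb = 12    # zettabytes created worldwide in 2015 (cited figure)
data_2025_zb = 163   # zettabytes forecast for 2025 (cited figure)
years = 2025 - 2015

cagr = (data_2025_zb / data_2015_zb) ** (1 / years) - 1
print(f"Implied annual growth rate: {cagr:.0%}")   # roughly 30% per year
```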
In my session, “Infrastructure Slowing Your AI Projects? Nuts & Bolts of AI-Ready Infrastructure,” we'll talk about what's needed to keep data flowing quickly and smoothly at each stage of the AI pipeline. We'll also discuss why it's so hard to unify data across silos with diverse infrastructure requirements, including:
- Data warehouse appliances requiring massive throughput for files and objects
- Data lakes built on commodity servers with scale-out storage
- Streaming analytics that depend on multi-dimensional performance
- AI clusters powered by tens of thousands of GPU cores for massively parallel processing, which are only as fast as the data feeding them (see the sketch after this list)
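To make that last point concrete, here is a minimal, hypothetical sketch of why feeding those GPUs matters: a rough check of whether a training epoch is bound by GPU compute or by storage throughput. Every number below is an illustrative assumption, not a measurement of any real system or a figure from the session.

```python
# Hypothetical back-of-envelope check: is a training epoch limited by GPU
# compute or by storage throughput? All numbers are illustrative assumptions.

dataset_per_epoch_tb = 50      # data read per training epoch (assumed)
storage_read_gb_per_s = 10     # sustained read throughput of the storage tier (assumed)
gpu_ingest_gb_per_s = 40       # rate at which the GPU cluster can consume data (assumed)

dataset_gb = dataset_per_epoch_tb * 1000
hours_if_storage_bound = dataset_gb / storage_read_gb_per_s / 3600
hours_if_compute_bound = dataset_gb / gpu_ingest_gb_per_s / 3600

print(f"Epoch time if storage-bound: {hours_if_storage_bound:.1f} hours")
print(f"Epoch time if compute-bound: {hours_if_compute_bound:.1f} hours")
if hours_if_storage_bound > hours_if_compute_bound:
    print("Storage is the bottleneck -- the GPUs spend much of the epoch waiting for data.")
```

When the storage-bound time is the larger of the two, adding more GPUs doesn't make training any faster; it just means more expensive hardware sitting idle while it waits for data.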
Other essential topics I'll address include changing assumptions about big data architecture with important implications for cost, performance, and agility; making the move from data lake to data hub; and how to decide whether the cloud is right for you, based on where you are in your own cloud journey.
Don’t let a cumbersome, ill-suited infrastructure slow your AI projects. Join me for my Applied AI Summit session, “Infrastructure Slowing Your AI Projects? Nuts & Bolts of AI-Ready Infrastructure,” and learn how you can make sure that your servers and storage systems accelerate AI projects—not slow them down.