Next Big Takeaways from Databricks Data + AI Summit and Apple WWDC 2024
Notable announcements from the consumer giant and data infrastructure titan during their flagship conferences could change the AI landscape for years to come.
It was a bustling time in the Bay Area last week with both Apple’s Worldwide Developers Conference (WWDC) and Databricks’ Data + AI Summit running concurrently. Both events unveiled exciting announcements that are set to significantly transform the AI landscape across consumer and enterprise. This post captures my major takeaways from both conferences.
1. Entering the age of interoperability for data workloads
Databricks’ flagship conference is aptly named “Data + AI Summit” since Data and AI are inextricably linked. As such, I’ll cover one major data takeaway as well as one major AI takeaway from the Summit, starting with data.
Over the past few years, the concept of a format-agnostic Lakehouse architecture has been gaining steam within the industry, with the Databricks team being a vocal champion of this approach. At a very basic level, Lakehouse architecture (visual below) is a single platform with a metadata and governance layer that combines the key benefits of data lakes (large repositories of raw data stored in original form) and data warehouses (organized sets of structured data). Here is the technical paper and here is an easier-to-understand layman’s explanation of Lakehouse architecture.
Excitement for Lakehouses has grown primarily because this type of architecture has the potential to unlock greater interoperability. Put simply, this means that multiple technologies now have the ability to safely operate over a single copy of data. This in turn could yield significant benefits for customers including less vendor lock-in and more unified governance.
This discussion of Lakehouses becoming the architecture of the future was again thrust into the spotlight following the keynote announcement that Databricks is open-sourcing its Unity Catalog. This comes hot on the heels of Snowflake announcing that it will be open-sourcing its Polaris Catalog. Additionally, Databricks announced its acquisition of Tabular (founded by the original creators of Apache Iceberg) the week prior to the Summit. Databricks had previously worked with the Linux Foundation to create the Delta Lake project, and the Tabular acquisition is significant since it brings together the creators of two leading and fast-growing open-source Lakehouse formats, Delta Lake and Apache Iceberg.
As a VC and open-source advocate, I’m immensely excited by these developments and the potential for startups in this new age of interoperability for data workloads. Many more thoughts to come in a longer post on this topic!
2. Reinforcement of “Compound AI Systems” as a core paradigm
Now let’s shift the focus slightly to the AI side. LLMs took the world by storm following the launch of ChatGPT in late 2022. Consequently, a lot of the dialogue within AI application development has been focused on foundation models being the primary element for driving state-of-the-art results. While foundation models can certainly be seen as the new “oil” powering AI applications, the dialogue from AI app developers is shifting away from single, monolithic models toward constructing compound systems with multiple, interacting components (such as retrieval, tool use, and agents). For instance, as my colleagues and I highlighted in our recent AI Infrastructure Roadmap, techniques such as retrieval-augmented generation (RAG) have been rising in popularity within the enterprise AI Infrastructure stack.
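To make the idea of interacting components concrete, here is a minimal sketch of a RAG-style compound system in Python. Everything in it is illustrative rather than taken from any vendor's product: the `Document` class, the keyword-overlap `retrieve` function, the `EXAMPLE_CORPUS` data, and the `stub_model` placeholder (which stands in for a real foundation model call) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Document:
    title: str
    text: str


# Hypothetical in-memory corpus; a real system would use a search or vector index.
EXAMPLE_CORPUS = [
    Document("lakehouse", "A lakehouse combines data lakes and data warehouses."),
    Document("rag", "RAG retrieves documents before calling a model."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, corpus: list[Document], k: int = 1) -> list[Document]:
    """Retrieval component: rank documents by word overlap with the query."""
    query_terms = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_terms & tokenize(doc.text)),
        reverse=True,
    )
    return ranked[:k]


def stub_model(prompt: str) -> str:
    """Placeholder for a foundation model call (assumption: no real API is used)."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"


def answer(
    query: str,
    corpus: list[Document],
    model: Callable[[str], str] = stub_model,
) -> str:
    """Compound system: retrieve context first, then pass it to the model."""
    context = retrieve(query, corpus)
    context_text = "\n".join(doc.text for doc in context)
    prompt = f"Context:\n{context_text}\n\nQuestion: {query}"
    return model(prompt)
```

The point of the sketch is the shape, not the parts: the model is only one of several swappable components, and the quality of the final answer depends on how the retriever and the model work together.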
Earlier this year, the Databricks team, in collaboration with other AI researchers, published a white paper coining this concept as a shift from models to “Compound AI systems”. This term was well socialized during the Summit, with Mosaic AI’s new feature launches centered upon building and deploying production-quality Compound AI Systems, and a comprehensive workshop session focused on this theme.
This is all very exciting since such Compound AI systems could help to bridge the performance gap between smaller proof-of-concept projects and scalable in-production deployments. Currently, this is a common friction point preventing AI companies from fully delivering on their promises to enterprise customers. An entire AI infrastructure ecosystem has already blossomed here and I’m excited to see how the tool chain continues to evolve in the coming months.
3. An era of hyper-personalized AI for consumers
Switching gears to consumer, let’s discuss what might be in store for the consumer AI landscape following Apple’s WWDC. This year’s WWDC was highly anticipated as many in the industry were expecting Apple to concretely reveal its AI strategy. Apple’s stock had rallied in May in the leadup to the conference.
Indeed, AI was the focus of WWDC 2024 with key announcements around Apple Intelligence and the company’s partnership with OpenAI. As I highlighted during my interview with LinkedIn News, Apple focused on a theme of personalized context throughout its AI narrative. Apple is undoubtedly one of the most well-positioned companies in the world to build hyper-personalized AI for consumers. It’s astonishing to think about how much data lives within an Apple device: texts, images, schedules, emails, contacts, location, health data, and even sleep patterns! Taken together, these data sources provide a valuable competitive advantage, as they can unlock an unprecedented level of tailored intelligence and agentic AI potential. The possibilities here are expansive, but also raise real questions about security and privacy (see Elon’s tweets below), which were topics explicitly addressed throughout WWDC. Apple highlighted that it plans to use a combination of on-device processing and cloud data centers to handle AI workloads, depending on their processing power requirements.
Widespread AI rollout on Apple devices is poised to have a dramatic impact as the company has a massive install base of over 2.2 billion active devices. As I had written previously in my article on the B2C2B momentum in today’s AI wave, one distinctive aspect of the current AI paradigm shift is how consumers have direct and first-hand exposure to the latest AI advancements. This is very different from previous AI chapters where AI companies prioritized enterprise distribution over consumer access. ChatGPT had a big role to play here by putting the power of AI directly in the hands of the masses, but Apple, with its significant global footprint, could now truly open the floodgates for consumer AI access and excitement. Reinforcing how powerful Apple’s distribution reach is, Bloomberg reported that Apple will not be paying OpenAI directly for this partnership.
Furthermore, Apple is in a unique position to capture value across the entirety of the AI stack for a user since the company produces chips, owns the device layer, and controls the operating system. Sales for Apple devices such as the iPhone have been declining in recent quarters, and AI features could potentially help to drive upgrade demand within its core business since Apple Intelligence will only be available on the latest devices and silicon (chart above). Additionally, there is strong potential for direct AI monetization opportunities, for instance in services and within the apps ecosystem, which could present new frontiers of growth for the company.