Event Speakers

Click on a speaker card to view the talk abstract.

Advanced Technical/Research

Business Strategy

Panel Discussion

Future of AI

Speakers Corner

Lightning Talks

(more coming soon)


Agenda

This agenda is still subject to change.
Talk: A Practical Guide to Efficient AI

Presenter:
Shelby Heinecke, Senior AI Research Manager, Salesforce

About the Speaker:
Dr. Shelby Heinecke leads an AI research team at Salesforce. Shelby’s team develops cutting-edge AI for product and research in emerging directions including autonomous agents, LLMs, and on-device AI. Prior to leading her team, Shelby was a Senior AI Research Scientist focusing on robust recommendation systems and productionizing AI models. Shelby earned her Ph.D. in Mathematics from the University of Illinois at Chicago, specializing in machine learning theory. She also holds an M.S. in Mathematics from Northwestern and a B.S. in Mathematics from MIT. Website: www.shelbyh.ai

Talk Track: Research or Advanced Technical

Talk Technical Level: 3/7

Talk Abstract:
In the past two years, we’ve witnessed a whirlwind of AI breakthroughs powered by extremely large and resource-demanding models. As engineers and practitioners, we are now faced with deploying these AI models at scale in resource-constrained environments, from cloud to on-device. In this talk, we will first identify key sources of inefficiency in AI models. Then, we will discuss techniques and practical tools to improve efficiency, from model architecture selection, to quantization, to prompt optimization.
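As a taste of the quantization technique the abstract mentions, here is a minimal, self-contained sketch (illustrative only, not from the talk): symmetric post-training int8 quantization of a single weight vector, with the reconstruction error it introduces.

```python
# Illustrative sketch: symmetric int8 post-training quantization of one
# weight vector. Real toolchains quantize per-tensor or per-channel with
# calibration data; this shows only the core scale-and-round idea.

def quantize_int8(weights):
    """Map floats onto the int8 range [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.02, -1.5, 0.7, 3.0, -0.004]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The per-weight error is bounded by half of one quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

The storage win is the point: each weight shrinks from 32 (or 16) bits to 8, at the cost of the bounded rounding error checked above.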

What You’ll Learn
TBA

Talk: AutoGen: Enabling Next-Gen AI Applications via Multi-Agent Conversation

Presenter:
Qingyun Wu, Assistant Professor at Penn State University; Creator of AutoGen, Agmax Inc

About the Speaker:
Dr. Qingyun Wu is an Assistant Professor at the College of Information Science and Technology at Penn State University. She was a postdoctoral researcher at Microsoft Research’s NYC lab from 2020 to 2021, and received her Ph.D. in Computer Science from the University of Virginia in 2020. Qingyun received the 2019 SIGIR Best Paper Award and the ICLR 2024 LLM Agents Workshop Best Paper Award. Qingyun is the creator and one of the core maintainers of AutoGen, a leading programming framework for agentic AI applications.

Talk Track: Advanced Technical/Research

Talk Technical Level: 1/7

Talk Abstract:
AutoGen is an open-source programming framework for agentic AI. It enables the development of AI agentic applications using multiple agents that can converse with each other to solve tasks. In this session, the speaker will provide a deep dive into the key concepts of AutoGen, demonstrate diverse applications enabled by AutoGen, and share the latest updates and ongoing efforts spanning across key directions such as evaluation, interfaces, learning/optimization/teaching, and seamless integration with existing AI technologies.

What You’ll Learn:

  • Agentic AI and the core concepts of AutoGen as an open-source programming framework for agentic AI
  • How AutoGen enables the development of AI applications using multiple conversing agents
  • The architecture and key components of the AutoGen framework
  • Various applications and use cases made possible by AutoGen
  • Recent updates and ongoing developments in the AutoGen project
  • Key areas of focus in AutoGen’s development.

 

You will gain insights into how AutoGen can be used to create advanced AI applications that leverage multi-agent conversations to solve complex tasks. You will also get a glimpse of the future directions and potential impact of this technology in the field of AI.
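For readers new to the pattern, the multi-agent conversation idea can be sketched in a few lines of plain Python. This is deliberately not AutoGen’s actual API: the two agents below are stub functions and the "TERMINATE" convention is only illustrative.

```python
# Toy sketch of a two-agent conversation loop, the pattern agentic
# frameworks like AutoGen build on. Agents here are stub functions that
# map an incoming message to a reply; a real system would call an LLM.

def solver(message):
    # Stub "assistant" agent: proposes an answer when given a task.
    if "task:" in message:
        return "proposed answer: 42"
    return "TERMINATE"

def critic(message):
    # Stub "user proxy" agent: accepts the first proposal it sees.
    if message.startswith("proposed answer"):
        return "looks good. TERMINATE"
    return "task: compute the answer"

def converse(agent_a, agent_b, opening, max_turns=6):
    """Alternate messages between two agents until one signals TERMINATE."""
    transcript = [opening]
    speakers = [agent_a, agent_b]
    msg = opening
    for turn in range(max_turns):
        msg = speakers[turn % 2](msg)
        transcript.append(msg)
        if "TERMINATE" in msg:
            break
    return transcript

log = converse(solver, critic, "task: compute the answer")
```

The value of a framework is everything this sketch omits: LLM-backed agents, tool execution, group-chat orchestration, and human-in-the-loop control.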

Talk: How to Run Your Own LLMs, From Silicon to Service

Presenter:
Charles Frye, AI Engineer, Modal Labs

About the Speaker:
Charles teaches people to build data, ML, and AI applications. He got his PhD from the University of California, Berkeley, in 2020 for work on the geometry of neural network optimization. He has since worked as an educator and evangelist for neural network applications at Weights & Biases, Full Stack Deep Learning, and now Modal Labs.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In this talk, AI Engineer Charles Frye will discuss the stack for running your own LLM inference service. We’ll cover: compute options like CPUs, GPUs, TPUs, & LPUs; model options like Qwen & LLaMA; inference server options like TensorRT-LLM, vLLM, & SGLang; and observability options like the OTel stack, LangSmith, W&B Weave, & Braintrust.
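One reason the hardware and server layers of this stack matter so much: single-stream autoregressive decoding is typically memory-bandwidth bound, so a back-of-envelope throughput ceiling follows from two numbers. The hardware figures below are assumptions for illustration, not from the talk.

```python
# Back-of-envelope decode throughput estimate: each generated token must
# read (roughly) all model weights from memory, so single-stream speed is
# bounded by memory bandwidth / model size in bytes. Batching and
# quantization change this picture, which is why serving stacks matter.

def decode_tokens_per_sec(params_billion, bytes_per_param, bandwidth_gb_s):
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Assumed example: a 7B-parameter model in fp16 (2 bytes/param) on a GPU
# with ~2 TB/s of memory bandwidth.
est = decode_tokens_per_sec(7, 2, 2000)   # ~143 tokens/sec ceiling
```

Halving bytes per parameter (e.g. int8 weights) roughly doubles this ceiling, which is one way quantization and inference-server choices interact.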

What You’ll Learn
Everything about serving LLMs, from the latest and greatest open source software tooling to the fundamental principles that drive engineering constraints across the stack.

Talk: Building State-of-the-Art Chatbot Using Open Source Models and Composite Systems

Presenter:
Urmish Thakker, Director of Machine Learning, SambaNova Systems

About the Speaker:
Urmish leads the LLM Team at SambaNova Systems. The LLM team at SambaNova focuses on understanding how to train and evaluate HHH-aligned large language models, adapting LLMs to enterprise use-cases, and HW-SW co-design of LLMs to enable efficient training and inference. Before SambaNova, he held various engineering and research roles at Arm, AMD, and Texas Instruments. He also helped drive the TinyML Performance Working Group in MLPerf, contributing to the development of key benchmarks for IoT ML. Urmish has 35+ publications and patents focusing on efficient deep learning and LLMs. His papers have been published at top ML and HW conferences like NeurIPS, ICLR, and ISCA, and he has been an invited speaker at various top universities and industry-academia summits. He completed his master’s at the University of Wisconsin-Madison and his bachelor’s at Birla Institute of Technology and Science.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
Open-source LLMs like LLAMA2 and BLOOM have enabled widespread development of enterprise LLM applications. As model adoption has matured over time, we have seen a rise in LLMs specialized for narrow domains, tasks, or modalities. Through such specialization, these models are able to outperform far larger proprietary or open models. For example, 7B-70B Llama experts like UniNER, TabLLAMA, NexusRaven, and SambaCoder-nsql-llama2 can outperform GPT-4 on NER, function calling, tabular data, and Text2SQL tasks. Many more such examples exist in open source. However, one unique feature that larger proprietary models offer is a single endpoint that takes a user query and provides a response. These responses can sometimes also include a chain of tasks that was solved to arrive at the response.

The question we try to answer in this research is whether we can build a composite LLM system from open-source checkpoints that provides the same usability as a larger proprietary model. This includes taking a user request and mapping it to a single checkpoint, or a group of checkpoints, that can solve the request and serve the user. Our work indicates that such a composite system is indeed possible. We show this by building a new state-of-the-art model based on various adaptations of the Mistral 7B model. Using unique ensembling methods, this composite model outperforms Gemma-7B, Mixtral-8x7B, Llama2-70B, Qwen-72B, Falcon-180B, and BLOOM-176B at the effective inference cost of a <10B-parameter model.
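The routing step at the heart of such a composite system can be caricatured in a few lines. The sketch below is purely illustrative: real systems use a learned router rather than keyword matching, and the expert checkpoint names here are hypothetical.

```python
# Toy sketch of the "composite system" idea from the abstract: map each
# incoming query to a specialized expert checkpoint. Keyword routing is a
# stand-in for the learned routers real systems use; expert names are
# hypothetical.

EXPERTS = {
    "sql": "text2sql-expert",
    "function": "function-calling-expert",
    "entity": "ner-expert",
}

def route(query, default="general-chat-model"):
    """Return the name of the checkpoint that should serve this query."""
    q = query.lower()
    for keyword, expert in EXPERTS.items():
        if keyword in q:
            return expert
    return default

model = route("Translate this question into SQL for our sales table")
```

The economics follow from the routing: every query pays the inference cost of one small expert, not of a monolithic 70B+ model.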

What You’ll Learn
TBA

Talk: DL-Backtrace by AryaXAI: A Model Agnostic Explainability for Any Deep Learning Models (LLMs to CV)

Presenter:
Vinay Kumar Sankarapu, Founder & CEO, Arya.ai

About the Speaker:
Vinay Kumar Sankarapu is the Founder and CEO of Arya.ai. He did his Bachelor’s and Master’s in Mechanical Engineering at IIT Bombay with research in Deep Learning and published his thesis on CNNs in manufacturing. He started Arya.ai in 2013, one of the first deep learning startups, along with Deekshith, while finishing his Master’s at IIT Bombay.

He co-authored a patent for designing a new explainability technique for deep learning and implementing it in underwriting in FSIs. He also authored a paper on AI technical debt in FSIs. He wrote multiple guest articles on ‘Responsible AI’, ‘AI usage risks in FSIs’. He presented multiple technical and industry presentations globally – Nvidia GTC (SF & Mumbai), ReWork (SF & London), Cypher (Bangalore), Nasscom(Bangalore), TEDx (Mumbai) etc. He was the youngest member of ‘AI task force’ set up by the Indian Commerce and Ministry in 2017 to provide inputs on policy and to support AI adoption as part of Industry 4.0. He was listed in Forbes Asia 30-Under-30 under the technology section.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
In today’s rapidly evolving AI landscape, deep learning models have become increasingly complex and opaque. They often function as “black boxes” that make decisions without transparent reasoning. This lack of explainability raises concerns in mission-critical applications where understanding the “why” behind a model’s decision is as important as the decision itself.

In this talk, we will introduce DL-Backtrace, a new technique designed at AryaXAI to explain any deep learning model—an LLM, a traditional computer vision model, or beyond. We will discuss the algorithm and benchmarking results of DL-Backtrace against techniques like SHAP, LIME, and GradCAM for various DL models: LLMs (Llama 3.2), NLP (BERT), CV (ResNet), and tabular data.

What You’ll Learn:
The importance of explainability in mission-critical functions and model pruning; the current scope of explainability for deep learning models; complexities in scaling explainability to large models like LLMs; drawbacks of various current techniques; background on DL-Backtrace; and results of the method for various DL models and subsequent work.

Talk: From Paper to Production in 30 Minutes: Implementing code-less Gen AI Research

Presenter:
Aarushi Kansal, AI Engineer, AutoGPT

About the Speaker:
Aarushi is a passionate and seasoned AI engineer, currently working at AutoGPT – one of the most popular projects on GitHub, aiming to democratize AI. Previously she initiated and led Generative AI at Bumble as a principal engineer. She has also been a software engineer at iconic companies such as ThoughtWorks, Deliveroo, and Tier Mobility.

Talk Track: Advanced Technical/Research

Talk Technical Level: 5/7

Talk Abstract:
New research papers appear in the Gen AI space almost every other day: on prompting, RAG, different models, and different ways to fine-tune. They often come with no code, and in this talk we're going to implement research papers in 30 minutes.

What You’ll Learn
In this talk the audience will learn how to take a research paper, implement it quickly (within 30 minutes), and then evaluate whether it is actually useful for their work.

Talk: Investigating the Evolution of Evaluation from Model Training to GenAI inference

Presenter:
Anish Shah, ML Engineer, Weights & Biases

About the Speaker:
Join Anish Shah for an in-depth session on fine-tuning and evaluating multimodal generative models. This talk will delve into advanced methodologies for optimizing text-to-image diffusion models, with a focus on enhancing image quality and improving prompt comprehension.
Learn how to leverage Weights & Biases for efficient experiment tracking, enabling seamless monitoring and analysis of your model’s performance.

Additionally, discover how to utilize Weave, a lightweight toolkit for tracking and evaluating LLM applications, to conduct practical and holistic evaluations of multimodal models.

The session will also introduce Hemm, a comprehensive library for benchmarking text-to-image diffusion models on image quality and prompt comprehension, integrated with Weights & Biases and Weave. By the end of this talk, you’ll be equipped with cutting-edge tools and techniques to elevate your multimodal generative models to the next level.

Talk Track: Advanced Technical/Research

Talk Technical Level: 2/7

Talk Abstract:
This session explores the evolution of evaluation techniques in machine learning, from traditional model training through fine-tuning to the current challenges of assessing large language models (LLMs) and generative AI systems. We’ll trace the progression from simple metrics like accuracy and F1 score to sophisticated automated evaluation systems that can generate criteria and assertions. The session will culminate in an in-depth look at cutting-edge approaches like EvalGen, which use LLMs to assist in creating aligned evaluation criteria while addressing phenomena like criteria drift.

What You’ll Learn:
Attendees will gain a comprehensive understanding of evaluation techniques across different ML paradigms, from cross-validation in traditional training to the nuances of evaluating fine-tuned models and LLMs. You’ll learn practical approaches for automating evaluation criteria and assertions, strategies for aligning these automated evaluations with human judgments, and techniques for handling the unique challenges posed by generative AI, such as criteria drift and the balance between human oversight and AI-assisted evaluation.
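The "automated evaluation criteria and assertions" idea can be made concrete with a small sketch. Systems like EvalGen use an LLM to propose criteria; the hand-written checks below are only an illustration of what such criteria look like once expressed as code.

```python
# Minimal sketch of criteria-based evaluation for LLM outputs: each
# criterion is a cheap programmatic assertion. EvalGen-style systems
# generate criteria like these automatically and align them with human
# judgments; these hand-written ones are purely illustrative.

criteria = {
    "mentions_refund": lambda out: "refund" in out.lower(),
    "under_50_words": lambda out: len(out.split()) < 50,
    "no_apology_spam": lambda out: out.lower().count("sorry") <= 1,
}

def evaluate(output):
    """Run every criterion and return per-criterion results plus pass rate."""
    results = {name: check(output) for name, check in criteria.items()}
    pass_rate = sum(results.values()) / len(criteria)
    return results, pass_rate

res, rate = evaluate("You will receive a refund within 5 days.")
```

"Criteria drift", as discussed in the session, is what happens when the set of checks that felt right at the start stops matching human judgments as outputs evolve.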

Talk: Can Long-Context LLMs Truly Use Their Full Context Window Effectively?

Presenter:
Lavanya Gupta, Senior Applied AI/ML Associate | CMU Grad | Gold Medalist | Tech Speaker, JPMorgan Chase & Co.

About the Speaker:
I am Lavanya, a graduate student from Carnegie Mellon University (CMU), Language Technologies Institute (LTI); and a passionate AI/ML industrial researcher with 5+ years of experience. I am also an avid tech speaker and have delivered several talks and participated in panel discussions at conferences like Women in Data Science (WiDS), Women in Machine Learning (WiML), PyData, TensorFlow User Group (TFUG). In addition, I am dedicated to providing mentorship via collaborations with multiple organizations like Anita Borg.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
Recently there has been growing interest in extending the context length (input window size) of large language models (LLMs), aiming to effectively process and reason over long input documents as large as 128K tokens (i.e., roughly 200 pages of a book). Long-context large language models (LC LLMs) promise to increase the reliability of LLMs in real-world tasks. Most model-provider benchmarks champion the idea that LC LLMs are getting better and smarter over time. However, these claims often fail to hold up in real-world applications.

In this session, we evaluate the performance of the state-of-the-art GPT-4 suite of LC LLMs in solving a series of progressively challenging tasks on a real-world financial news dataset, using an adapted version of the popular “needle-in-a-haystack” paradigm. We see that leading LC LLMs exhibit brittleness at longer context lengths even for simple tasks, with performance deteriorating sharply as task complexity increases. At longer context lengths, these state-of-the-art models experience catastrophic failures in instruction following, resulting in degenerate outputs. Prompt ablations also expose continued sensitivity both to the placement of the task instruction in the context window and to minor markdown formatting.

Overall, we will address the following questions in our session:
1. Does performance depend on the choice of prompting?
2. Can models reliably use their full context?
3. Does performance depend on the complexity of the underlying task?
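The core mechanics of a needle-in-a-haystack probe are easy to sketch: plant a known fact at a controlled depth inside filler context, then score whether the model recovers it. The filler text and recall metric below are illustrative; the model call itself is left out.

```python
# Sketch of a needle-in-a-haystack probe. A known "needle" fact is placed
# at a chosen depth in filler context; sweeping depth_fraction and context
# size maps out where a long-context model stops recovering the fact.
# The LLM call is omitted; scoring here is simple exact recall.

def build_haystack(needle, filler_sentence, total_sentences, depth_fraction):
    """Place the needle at depth_fraction (0.0 = start, 1.0 = end)."""
    position = int(depth_fraction * total_sentences)
    sentences = [filler_sentence] * total_sentences
    sentences.insert(position, needle)
    return " ".join(sentences)

needle = "The special magic number is 7481."
context = build_haystack(needle, "The market was quiet today.", 200, 0.5)

def score(model_answer):
    # 1.0 if the embedded fact was recovered, else 0.0.
    return 1.0 if "7481" in model_answer else 0.0
```

The adaptations the session describes replace the synthetic needle and filler with facts and documents from a real-world corpus, which is where the brittleness shows up.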

What You’ll Learn
Key learnings:
1. Complete understanding of the popular “Needle-in-a-Haystack” paradigm
2. Learning the shortcomings of the traditional “Needle-in-a-Haystack” setup and how to improve it for real-world applications
3. Can state-of-the-art long-context LLMs truly use their full context window effectively?
4. Are models able to perform equally well at both short-context and long-context tasks?

Talk: Unleashing the Algorithm Genie: AI as the Ultimate Inventor

Presenter:
Jepson Taylor, Former Chief AI Strategist at DataRobot & Dataiku; VEOX Inc

About the Speaker:
Jepson is a popular speaker in the AI space, having been invited to give AI talks to companies like SpaceX, Red Bull, Goldman Sachs, Amazon, and various branches of the US government. Jepson’s applied career has covered semiconductors, quant finance, HR analytics, a deep-learning startup, and AI platform companies. Jepson co-founded and sold his deep-learning company Zeff.ai to DataRobot in 2020 and later joined Dataiku as their Chief AI Strategist. Jepson is currently launching a new AI company focused on the next generation of AI called VEOX Inc.

Talk Track: Research or Advanced Technical

Talk Technical Level: 3/7

Talk Abstract:
Prepare to have your understanding of AI capabilities turned upside down. Jepson Taylor presents groundbreaking advancements in the field of generative algorithms, where AI systems now possess the ability to invent and optimize their own algorithms. This talk explores how adaptive workflows can produce thousands of novel solutions daily, effectively automating the role of the AI researcher. Through engaging demonstrations, attendees will explore the vast potential of this technology to accelerate innovation across all sectors. Discover how these self-evolving systems are set to redefine the boundaries of what’s possible in technology and learn how you can start incorporating these concepts into your own work.

What You’ll Learn
Cutting-edge advancements in multi-agent systems and their role in driving AI innovation. The paradigm shift from prompt engineering to goal engineering in AI development. The power and potential of bespoke algorithms versus general-purpose solutions. How generative algorithms are revolutionizing the field of AI research and development. Practical insights into implementing automated innovation systems for rapid solution generation. Strategies for integrating self-evolving AI systems into various industry applications. Real-world examples and case studies of generative algorithms in action.

Talk: Build with Mistral

Presenter:
Sophia Yang, Head of Developer Relations, Mistral AI

About the Speaker:
Sophia Yang is the Head of Developer Relations at Mistral AI, where she leads developer education, developer ecosystem partnerships, and community engagement. She is passionate about the AI community and the open-source community, and she is committed to empowering their growth and learning. She holds an M.S. in Computer Science, an M.S. in Statistics, and a Ph.D. in Educational Psychology from The University of Texas at Austin.

Talk Track: Advanced Technical/Research

Talk Technical Level: 1/7

Talk Abstract:
In the rapidly evolving landscape of Artificial Intelligence (AI), open source and openness in AI have emerged as crucial factors in fostering innovation, transparency, and accountability. Mistral AI’s release of open-weight models has sparked significant adoption and demand, highlighting the importance of open source and customization in building AI applications. This talk focuses on the Mistral AI model landscape, the benefits of open source and customization, and the opportunities for building AI applications using Mistral models.

What You’ll Learn
TBA

Talk: Agentic AI: Learning Iteratively, Acting Autonomously

Presenter:
Fatma Tarlaci, CTO, Rastegar Capital

About the Speaker:
Dr. Fatma Tarlaci is a distinguished engineering leader with a wealth of experience in artificial intelligence, specializing in natural language processing. As the Chief Technology Officer at Rastegar Capital, she possesses a unique combination of technical expertise and leadership skills that distinguish her in the industry. She applies advanced AI solutions in her work that enhance business strategies and operational efficiencies through the integration of cutting-edge technologies. Her approach combines her deep expertise in AI with practical applications to solve complex challenges in the industry. Before entering the industry, she conducted research at OpenAI and taught in academia. Demonstrating her commitment to education, she currently also teaches as an Adjunct Assistant Professor of Computer Science at UT Austin. Her interdisciplinary background provides her with a refined ability to navigate diverse professional environments effectively. She mentors at several organizations and serves as an advisor to startups.

Talk Track: Advanced Technical or Research

Talk Technical Level: 3/7

Talk Abstract:
AI agents are advanced software systems capable of autonomous actions and decision-making to achieve specific goals. Built on sophisticated machine learning models, they can process and respond to dynamic data inputs in real-time. Unlike traditional large language models (LLMs) that generate outputs in a single attempt (zero-shot mode), agentic AI introduces an iterative, dynamic workflow. These workflows often involve multiple stages—planning, data gathering, drafting, assessment, and revision—significantly improving the quality of outcomes. This process mirrors human learning, where continual refinement leads to better results. Notably, iterative agentic workflows have recently shown impressive performance in tasks like coding, outperforming standard models on benchmarks such as HumanEval.

This presentation will provide an in-depth analysis of the architectures that underpin agentic AI, explore the cutting-edge technologies enabling their capabilities, and delve into their practical applications. It will also address key challenges in the field, such as scalability and ethical considerations, while exploring future directions. Attendees will gain a thorough understanding of AI agents, their current uses, and their potential to transform industries.
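The iterative workflow the abstract contrasts with zero-shot generation (plan, draft, assess, revise) can be sketched as a simple loop. The drafting and assessment steps below are stubs standing in for LLM calls; the structure, not the content, is the point.

```python
# Sketch of the iterative agentic workflow described in the abstract:
# draft, assess, and revise until the assessment passes or a budget runs
# out. The draft/assess functions are stubs in place of real LLM calls.

def draft(task, feedback=None):
    text = f"answer to {task}"
    if feedback:
        text += " (revised)"
    return text

def assess(text):
    # Stub critic: only revised drafts pass.
    return "(revised)" in text

def agentic_loop(task, max_iters=5):
    """Iterate draft -> assess -> revise, returning the result and iteration count."""
    feedback = None
    candidate = None
    for i in range(max_iters):
        candidate = draft(task, feedback)
        if assess(candidate):
            return candidate, i + 1
        feedback = "needs revision"
    return candidate, max_iters

result, iterations = agentic_loop("summarize the report")
```

A zero-shot model is this loop with max_iters=1; the measured gains on benchmarks like HumanEval come from the extra assess-and-revise passes.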

What You’ll Learn:
Attendees will learn about the fundamental architectures and technologies underpinning AI agents, their real-world applications, and the challenges they face, including ethical issues and scalability. The session will also explore future trends in AI development and its potential impact across various sectors.

Talk: Fast and Reproducible: Taming AI/ML Dependencies

Presenter:
Savin Goyal, Co-founder & CTO, Outerbounds

About the Speaker:
Savin is the co-founder and CTO of Outerbounds – where his team is building the modern ML stack to accelerate the impact of data science. Previously, he was at Netflix, where he built and open-sourced Metaflow, a full stack framework for data science.

Talk Track: Advanced Technical/Research

Talk Technical Level: 3/7

Talk Abstract:
Careful management of software dependencies is one of the most underrated parts of ML and AI systems, despite being critically important for the stability of production deployments and the speed of development. For many years, we have worked with the wider Python package management community (pip, conda, rattler, uv, and many more) and multiple organizations (Netflix, Amazon, Goldman Sachs, and many more) to advance the state of the art in dependency management for ML/AI platforms, including our open-source framework Metaflow.

In this talk, we’ll explore common pitfalls in dependency management and their impact on ML projects, from unexpected results due to package changes to the challenges of reproducing environments across different machines. We’ll cover issues ranging from the complexities of scaling dependencies in distributed cloud environments to performance regressions from seemingly innocuous updates, highlighting why robust dependency management is crucial for production ML systems.

We’ll share our learnings and demonstrate how we address key challenges in building robust and maintainable ML systems, such as:
  • Creating fast, stable, and reproducible environments for quick experimentation
  • Scaling cross-platform execution to the cloud with automatic dependency handling
  • Auditing the full software supply chain for security and compliance

We’ll also demo some of our recent work which enables baking very large container images in just a few seconds, significantly accelerating the prototyping and experimentation cycle for ML practitioners.

What You’ll Learn
This talk explores the critical yet often overlooked role of software dependency management in ML and AI systems. Drawing from years of collaboration with the Python package management community and major organizations, the speaker will share insights on common pitfalls in dependency management and their impact on ML projects, as well as recent innovations in rapid container image creation for accelerated ML experimentation.

Talk: Revolutionizing Cloud Storage: From Petabytes to Intelligence

Presenter:
Vinit Dhatrak, Lead Software Engineer, DocuSign

About the Speaker:
Vinit is a seasoned software engineer with a demonstrated history of building on-premise and cloud-native distributed systems at scale. Currently, Vinit serves as a Lead Software Engineer at DocuSign, contributing to DocuSign’s Storage team. With expertise encompassing cloud storage, distributed systems, and virtualization technologies such as Kubernetes, Docker, and the Linux Kernel, Vinit stands out as a thought leader in the tech industry. Throughout his career, Vinit has held pivotal roles at notable companies like Google, Box, Commvault, and Marvell, where he played an instrumental role in developing highly scalable and distributed cloud storage solutions. His proficiency in object-oriented design and systems programming, coupled with his capability to scale infrastructures to handle concurrent requests and planet-scale storage, positions him as a true expert in his field. Vinit is an alumnus of the Georgia Institute of Technology, where he earned a Master’s degree in Computer Science. His technical acumen and leadership capabilities are evident in his ability to mentor peers and collaborate effectively with industry leaders. Recognized for his impactful contributions, Vinit frequently engages with the tech community through conference participation. His dedication to advancing technological solutions is evident not only in his professional experience but also in his commitment to ongoing learning and development. Stay connected with Vinit through his LinkedIn profile to gain insights from his extensive knowledge of scalable design and distributed systems, as he continues to innovate and lead in the ever-evolving landscape of technology.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In an era driven by exponential data growth and the need for intelligent insights, cloud storage solutions must evolve to meet the dynamic demands of modern enterprises. This talk delves into the intricate process of migrating traditional on-premises blob storage systems to cutting-edge cloud platforms like Azure while integrating AI-powered insights to enhance cloud software offerings.

We will explore the architecture and implementation challenges faced while leading the Blob Storage team at DocuSign, focusing on how AI technologies were harnessed to transform data management within the Intelligent Agreement Management (IAM) platform. This initiative did not just facilitate a seamless transition but also pioneered a new category in cloud software, significantly bolstering market leadership.

Attendees will gain insights into optimizing resource utilization, achieving cost efficiency, and ensuring scalability in cloud migrations, drawing from a successful case implementing an intelligence-enabled cloud ecosystem. Furthermore, the talk will illuminate how AI and machine learning models were leveraged to provide actionable insights, assisting in strategic decision-making and enhancing user engagement.

The session will cover critical lessons learned, including identity and data security in cloud transformations, effective use of REST APIs for integration, and the deployment of microservices for agile and scalable services. Participants will leave equipped with advanced strategies to align their cloud migration efforts with organizational goals, optimize resources, and drive innovation through AI. Join us as we explore the convergence of cloud and artificial intelligence, unlocking new potentials in data storage solutions.

What You’ll Learn:
You’ll learn how to migrate on-premises blob storage to cloud platforms like Azure, focusing on DocuSign’s experience. We’ll explore the architectural and implementation challenges, highlighting how AI-powered insights were integrated into the Intelligent Agreement Management (IAM) platform. The talk will cover optimizing resource utilization and achieving cost efficiency during cloud migrations, using successful case studies. you’ll gain practical strategies for aligning cloud migrations with organizational goals, fostering innovation through AI-driven insights, and maximizing user engagement.

Talk: Optimizing AI/ML Workflows on Kubernetes: Advanced Techniques and Integration

Presenter:
Anu Reddy, Senior Software Engineer, Google

About the Speaker:
Anu is a senior software engineer working on optimizing Google Kubernetes Engine for techniques like RAG and supporting popular AI/ML frameworks and tools such as Ray.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
Explore advanced technical strategies for optimizing AI/ML workflows on Kubernetes, the world’s leading open-source container orchestration platform. This session will cover techniques for integrating open-source AI tools across a wide range of workflows, including training, inference, and prompt engineering (RAG, agents); managing multi-cluster environments; and ensuring cost-effective resource utilization. Participants will gain deep insights into how Kubernetes supports flexible and scalable AI/ML infrastructure, with specific examples of using Kubernetes-native tools like Kueue for job queuing and Ray for distributed computing. The session will also highlight the use of NVIDIA GPUs, TPUs, and advanced workload management strategies, with Google Kubernetes Engine (GKE) as an illustrative example.


What You’ll Learn
– Advanced techniques for optimizing AI/ML workflows on Kubernetes
– Integration of open-source AI tools within Kubernetes environments
– Strategies for managing multi-cluster AI/ML deployments and optimizing resource utilization

Talk: Driving GenAI Success in Production: Proven Approaches for Data Quality, Context, and Logging

Presenter:
Alison Cossette, Developer Advocate, Neo4j

About the Speaker:
Alison Cossette is a dynamic Data Science Strategist, Educator, and Podcast Host. As a Developer Advocate at Neo4j specializing in Graph Data Science, she brings a wealth of expertise to the field. With her strong technical background and exceptional communication skills, Alison bridges the gap between complex data science concepts and practical applications.

Alison’s passion for responsible AI shines through in her work. She actively promotes ethical and transparent AI practices and believes in the transformative potential of responsible AI for industries and society. Through her engagements with industry professionals, policymakers, and the public, she advocates for the responsible development and deployment of AI technologies. She is currently a volunteer member of the US Department of Commerce – National Institute of Standards and Technology’s Generative AI Public Working Group.

Alison’s academic journey includes a Master of Science in Data Science, specializing in Artificial Intelligence, at Northwestern University and research with the Stanford University Human-Computer Interaction Crowd Research Collective. Alison combines academic knowledge with real-world experience and leverages this expertise to educate and empower individuals and organizations in the field of data science. Overall, her multifaceted background, commitment to responsible AI, and expertise in data science make her a respected figure in the field. Through her role as a Developer Advocate at Neo4j and her podcast, she continues to drive innovation, education, and responsible practices in the exciting realm of data science and AI.

Talk Track: Advanced Technical/Research

Talk Technical Level: 2/7

Talk Abstract:
Generative AI is a part of our everyday work now, but folks are still struggling to realize business value from it in production.

Key Themes:

Methodical Precision in Data Quality and Dataset Construction for RAG Excellence: Uncover an integrated methodology for refining, curating, and constructing datasets that form the bedrock of transformative GenAI applications. Specifically, focus on the six key aspects crucial for Retrieval-Augmented Generation (RAG) excellence.

Navigating Non-Semantic Context with Awareness: Explore the infusion of non-semantic context through graph databases while understanding the nuanced limitations of the Cosine Similarity distance metric. Recognize its constraints in certain contexts and the importance of informed selection in the quest for enhanced data richness.

The Logging Imperative: Recognize the strategic significance of logging in the GenAI landscape. From application health to profound business insights, discover how meticulous logging practices unlock valuable information and contribute to strategic decision-making.

Key Takeaways:

6 Requirements for GenAI Data Quality

Adding non-semantic context, including an awareness of limitations in distance metrics like Cosine Similarity.

The strategic significance of logging for application health and insightful business analytics.

Join us on this methodologically rich exploration, “Beyond Vectors,” engineered to take your GenAI practices beyond the current Vector Database norms, unlocking a new frontier in GenAI evolution with transformative tools and methods!
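The cosine-similarity limitation flagged above is easy to demonstrate. Here is a minimal sketch (an illustrative example, not material from the talk itself): cosine similarity compares direction only, so any signal encoded in vector magnitude is invisible to it.

```python
import math

def cosine_similarity(a, b):
    # Compares direction only: dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Same direction, wildly different magnitudes: cosine scores them as identical,
# so any magnitude-encoded information (e.g. popularity, confidence) is lost.
print(cosine_similarity([1.0, 2.0, 3.0], [100.0, 200.0, 300.0]))  # 1.0
```

This is one reason a graph database can help: relationships supply non-semantic context that no direction-only distance metric can capture.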

What You’ll Learn
TBA

Talk: The State-of-the-art in Software Development Agents

Presenter:
Graham Neubig, Associate Professor / Chief Scientist, Carnegie Mellon University / All Hands AI

About the Speaker:
Graham Neubig is an associate professor at Carnegie Mellon University, focusing on machine learning methods for natural language processing, code generation, and AI agents. He is also co-founder and chief scientist at All Hands AI, a company building open-source software development agents.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
One of the most exciting application areas of AI is software development, and gradually we are moving towards more autonomous development where AI agents can perform software development tasks end-to-end. In this talk I will describe the latest work on AI software developers, including research developments, evaluation methods, and challenges that agents currently face. I will also discuss some of the MLOps challenges involved in deploying these agents, including safety and efficiency concerns.

What You’ll Learn:

  • Some of the latest developments in software-related AI
  • Representative software products at different levels of autonomy
  • AI modeling challenges involved in making highly accurate AI software engineers
  • MLOps challenges involved in deploying these software AI agents

Talk: LLMs Alone Do Not Solve Business Problems

Presenter:
Marinela Profi, Global AI & GenAI Lead, SAS

About the Speaker:
Marinela Profi is the Global AI & GenAI Lead at SAS. Leveraging her extensive background in data science, Marinela brings a unique perspective that bridges the realms of technology and marketing. She drives AI implementation within the Banking, Manufacturing, Insurance, Government, and Energy sectors. Marinela has a Bachelor’s in Econometrics, a Master of Science in Statistics and Machine Learning, and a Master’s in Business Administration (MBA). She enjoys sharing her journey on LinkedIn, and on the main stage, to help those interested in a career in data and tech.

Talk Track: Business Strategy

Talk Technical Level: 4/7

Talk Abstract:
The rapid rise of generative AI in 2023 sparked widespread experimentation, with companies across industries eager to leverage its potential for content generation, task automation, and customer experience transformation. However, not all organizations have found success, and by 2025, a clear divide will emerge between those excelling with AI-driven innovation and those struggling to keep up. In this session we will explore the key principles that are differentiating organizations who are winning the generative AI wave and successful use cases.

What You’ll Learn
Join me to learn the reasons behind this divide, mistakes to avoid and key factors and use cases driving success for companies leveling up with AI, so you can win too.

Talk: AI Features Demand Evidence-Based Decisions

Presenter:
Connor Joyce, Senior User Researcher, Microsoft and Author of “Bridging Intentions to Impact”

About the Speaker:
Connor Joyce is the author of “Bridging Intentions to Impact” and a Senior User Researcher on the Microsoft Copilot Team, where he is advancing the design of AI-enhanced features. Passionate about driving meaningful change, Connor advocates that companies adopt an Impact Mindset, ensuring that products not only change behavior to satisfy user needs but also drive positive business outcomes. He is a contributor to numerous publications, advises emerging startups, and lectures at the University of Pennsylvania. Based in Seattle, Connor enjoys exploring the outdoors with his dog, Chai, and is a local event organizer.

Talk Track: Business Strategy

Talk Technical Level: 5/7

Talk Abstract:
We are in the midst of a technology paradigm shift, and there is significant pressure on product teams to build Generative AI (GenAI) into their products. Navigating these uncharted waters requires decisions based on a deep understanding of user needs to ensure that this new technology is leveraged in the most beneficial way for both users and the business. This presentation emphasizes the necessity of creating a demand for insights by product teams and the democratization of evidence creation. Doing both can be achieved by defining features in a way that highlights the evidence supporting why they should work. By using the novel User Outcome Connection, teams can naturally identify what data is known and unknown about a feature. This framework makes the pursuit of new research to fill the gaps more straightforward, ensuring a solid foundation for decision-making.

By developing User Outcome Connection frameworks for key features, teams can design solutions that appropriately and effectively incorporate GenAI. This will be showcased through B2B and B2C examples illustrating the practical application and transformative potential of this approach.

What You’ll Learn
Attendees will learn how using the User Outcome Connection framework for key features enables the strategic use of GenAI where it truly adds value. By the end of this session, participants will be equipped with actionable steps to adopt evidence-based frameworks, ensuring their products meet the evolving demands of technology and user expectations. Join this session to learn how to navigate the AI paradigm shift with evidence-based decisions and design truly impactful AI-enhanced features.

Talk: Building Trust in AI Systems

Presenter:
Joseph Tenini, Principal Data Scientist, Universal Music Group

About the Speaker:
Joseph Tenini has worked in data science for over a decade in a variety of industries including healthcare, publishing, digital marketing, and entertainment. He has developed, deployed, and managed the lifecycle of a variety of ML-enabled products in many different settings. His specific expertise lies in recommender systems, reinforcement learning, and process improvement. He holds a PhD in Mathematics from the University of Georgia.

Talk Track: Business Strategy

Talk Technical Level: 3/7

Talk Abstract:
As builders of AI and ML systems, we spend much time and effort building our own trust in the technology we are developing. This can take the form of model accuracy metrics, compute efficiency, and core functionality achieved. There is another, often more daunting, step to be considered: building trust in the technology with non-technical users and other stakeholders who will be impacted by its adoption.

In this talk, we explore four pillars of building trust in AI systems with non-technical stakeholders:
1. Describing performance relative to an interpretable and intuitive baseline.
2. Quantifying uncertainty as part of the delivery process.
3. Sharing “the why” in non-binary decision processes.
4. Designing for second-order process effects.

After this talk, machine learning practitioners and managers will be equipped to build trust in the products they develop – enabling maximum value and impact from their work.

What You’ll Learn
TBA

Talk: AI in Financial Services: Emerging Trends and Opportunities

Presenter:
Awais Bajwa, Head of Data & AI Banking, Bank of America

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: MLOps for AgenticAI: How to Manage Agents in Production

Presenter:
Eero Laaksonen, CEO & Founder, Valohai

About the Speaker:
Serial entrepreneur hippie with a keen interest in making the world a better place. I believe people should work less and enjoy life more. Currently escalating the adoption of machine learning in enterprises around the world with Valohai.

Talk Track: Business Strategy

Talk Technical Level: 3/7

Talk Abstract:
AI Agents are taking over (and for good reason). There’s infinite yet untapped potential for everyone, from enterprises to startups, working on everything from supporting internal operations to shipping user-facing features. New players are emerging specifically to offer AI Agents as a service, often catering to specific industries.

However, very few have succeeded in getting their AI Agents to production and generating value from them. One of the main reasons is the complex infrastructure and MLOps best practices that must be in place from day one.

What You’ll Learn
– How to build the foundation for future-proofing the success of proprietary AI Agents
– The trade-offs in MLOps stacks and AI infrastructure in the Agentic AI space
– How to manage AI Agents in production and maximize return on investment

Talk: Measuring the Minds of Machines: Evaluating Generative AI Systems

Presenter:
Jineet Doshi, Staff Data Scientist/AI Lead, Intuit

About the Speaker:
Jineet Doshi is an award-winning AI Lead and Engineer with over 7 years of experience. He has a proven track record of leading successful AI projects and building machine learning models from design to production across various domains, which have impacted millions of customers and have significantly improved business metrics, leading to millions of dollars of impact. He is currently an AI Lead at Intuit, where he is one of the architects of their Generative AI platform which was featured on Forbes and Wall Street.

Jineet has also delivered guest lectures at Stanford University and UCLA on Applied AI. He is on the Advisory Board of the University of San Francisco’s AI Program. He holds multiple patents in the field, has advised numerous AI startups, and has co-chaired workshops at top AI conferences like KDD.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
Evaluating LLMs is essential in establishing trust before deploying them to production. Even post-deployment, evaluation is essential to ensure LLM outputs meet expectations, making it a foundational part of LLMOps. However, evaluating LLMs remains an open problem. Unlike traditional machine learning models, LLMs can perform a wide variety of tasks, such as writing poems, Q&A, and summarization. This leads to the question: how do you evaluate a system with such broad intelligence capabilities? This talk covers the various approaches for evaluating LLMs along with the pros and cons of each. It also covers evaluating LLMs for safety and security, and the need for a holistic approach to evaluating these very capable models.

What You’ll Learn
The audience will learn why evaluating GenAI systems is fundamental yet remains an open problem; a broad overview of different techniques for evaluating GenAI systems (including some state-of-the-art ones), along with the pros and cons of each; how other ML practitioners are doing LLM evals; and techniques for evaluating safety and security.

Talk: From Silos to Synergy: MLOps & Developers Unified

Presenter:
Yuval Fernbach, VP & CTO, JFrog

About the Speaker:
Yuval Fernbach is the CTO of MLOps at JFrog, previously co-founder and CTO of Qwak.
With over a decade of experience in data and machine learning, Yuval led the creation of a user-friendly ML Platform that simplifies building, training, and deploying models. Before Qwak, he served as an ML Specialist at AWS, helping clients harness machine learning to drive business transformation. Yuval is passionate about using data and technology to foster innovation.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In the evolving landscape of software, machine learning is no longer an isolated component; it’s an integral part of the entire secure software supply chain. For ML engineers, this shift presents an exciting opportunity to go beyond experimentation and model development to actively contribute to the secure and scalable delivery of AI solutions. This talk will explore how unifying MLOps with traditional software development processes enhances security, streamlines deployments, and enables ML models to be part of the broader company-wide deployment strategy. Learn how becoming part of the overall software supply chain can empower you to make a bigger impact, ensuring your models reach production safely and effectively, while aligning with global deployment standards.

What You’ll Learn:
TBD

Talk: Enabling Safe Enterprise Adoption of Generative AI

Presenter:
John Hearty, Head of AI Governance, Mastercard

About the Speaker:
TBD

Talk Track: Case Study

Talk Technical Level: 6/7

Talk Abstract:
At Mastercard, we have over a decade of experience leveraging AI, with a mature AI Governance program that provides oversight, and enables the fair, effective, and transparent use and development of AI solutions. However, Generative AI has brought new challenges and risks, which have made us rethink our processes.

What You’ll Learn:
We will discuss Mastercard’s journey to setting up our AI Governance Program, and how we’ve adapted it to meet the demands of emerging technology.
We will also discuss:
– How we have operationalized responsible AI development
– The new possibilities that Generative AI brings, as well as the challenges and how we have adapted to them
– Ways of leveraging this technology in a safe and effective way
– Lessons learned from a relatively small team enabling a major enterprise (the importance of strategic partnerships!)
– Scaling enterprise-wide adoption of consistent governance frameworks and risk management techniques for GenAI, focusing on process and scale

Talk: On-Device ML for LLMs: Post-training Optimization Techniques with T5 and Beyond

Presenter:
Sri Raghu Malireddi, Senior Machine Learning Engineer, Grammarly

About the Speaker:
Sri Raghu Malireddi is a Senior Machine Learning Engineer at Grammarly, working on On-Device Machine Learning. He specializes in deploying and optimizing Large Language Models (LLMs) on-device, focusing on improving system performance and algorithm efficiency. He has played a key role in the on-device personalization of the Grammarly Keyboard. Before joining Grammarly, he was a Senior Software Engineer and Tech Lead at Microsoft, working on several key initiatives for deploying machine learning models in Microsoft Office products.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
This session explores the practical aspects of implementing Large Language Models (LLMs) on devices, focusing on models such as T5 and its modern variations. Deploying ML models on devices presents significant challenges due to limited computational resources and power constraints. However, On-Device ML is crucial as it reduces dependency on cloud services, enhances privacy, and lowers latency.

Optimizing LLMs for on-device deployment requires advanced techniques to balance performance and efficiency. Grammarly is at the forefront of On-Device ML, continuously innovating to deliver high-quality language tools. This presentation offers valuable insights for anyone interested in the practical implementation of on-device machine learning using LLMs, drawing on Grammarly’s industry application insights.

The topics covered in this talk are:
– Techniques for optimizing performance and reducing inference latency in LLMs: quantization, pruning, layer fusion, etc.
– Methods to develop efficient and scalable AI solutions on edge devices.
– Common challenges in deploying LLMs to edge devices: over-the-air updates, logging, and debugging issues in production.
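One of the optimization techniques above can be sketched in a few lines. The following is a minimal, hypothetical illustration of magnitude-based weight pruning (not Grammarly’s actual implementation): rank weights by absolute value and zero out the smallest fraction.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # how many weights to zero
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# Half the weights are zeroed; the large-magnitude ones survive.
print(magnitude_prune([0.01, -0.8, 0.05, 1.2, -0.02, 0.3], 0.5))
# [0.0, -0.8, 0.0, 1.2, 0.0, 0.3]
```

In practice the resulting sparsity only pays off on-device when paired with a sparse storage format or kernels that skip zeros, and frameworks prune per-layer tensors rather than flat lists.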

What You’ll Learn
TBA

Talk: Code Generation Agents: Architecture, Data Modeling Challenges, and Production-Ready Considerations

Presenters:
Lee Twito, GenAI Lead, Lemonade | Alon Gubkin, Co-Founder & CTO, Aporia

About the Speaker:
In 2019, Alon Gubkin cofounded Aporia, the ML observability platform. Aporia is trusted by Fortune 500 companies and data science teams in every industry to ensure responsible AI and monitor, improve, and scale ML models in production. Alon, an ex-R&D team lead in the elite Unit 81 intelligence unit of the Israel Defense Forces, has led Aporia in raising $30 million from investors like Tiger Global Management and Samsung Next. For two years in a row, 2022 and 2023, Alon was named to Forbes 30 Under 30.

Talk Track: Case Study

Talk Technical Level: 5/7

Talk Abstract:
This session provides an in-depth look at the architecture of multi-agent systems for code generation, emphasizing practical solutions to common data modeling challenges. We’ll explore how focusing on signature data, employing linters, indexing codebases, and utilizing GraphRAG can address issues often faced in agent-driven coding environments. We’ll examine examples where models fall short, and show new concepts that resolve these challenges. You’ll leave with a practical understanding of how multi-agent systems are structured, as well as insights on minimizing bugs and enhancing reliability.

What You’ll Learn:
You’ll leave with a deep understanding of how advanced multi-agent systems for code generation operate, including solutions to common data modeling challenges. You’ll also discover practical techniques—like using linters, codebase indexing, and graph retrieval-augmented generation—that you can apply to make your own agent systems more reliable and efficient.

Talk: Demystifying Multi-Agent Patterns

Presenter:
Pablo Salvador Lopez, Principal AI Architect, Microsoft

About the Speaker:
As a seasoned engineer with extensive experience in AI and machine learning, I possess a unique blend of skills in full-stack data science, machine learning, and software engineering, complemented by a solid foundation in mathematics. My expertise lies in designing, deploying, and monitoring AI/ML software products at scale, adhering to MLOps/LLMOps and best practices in software engineering.

Having previously led the MLOps practice at Concentrix Catalyst and the ML Engineering global team at Levi Strauss & Co., I have developed a profound understanding of implementing real-time and batch time ML solutions for several Fortune 500 enterprises. This experience has significantly enhanced my ability to manage big data and leverage cloud engineering, particularly with Azure’s AI, GCP and AWS.

Currently, at Microsoft, as a Principal Technical Member of the prestigious AI Global Black Belt team, I am dedicated to empowering the world’s largest enterprises with cutting-edge generative AI and machine learning solutions. My role involves driving transformative outcomes through the adoption of the latest AI technologies and demystifying the most complex architectural and development patterns. Additionally, I am actively involved in shaping the industry’s direction in LLMOps and contributing to open source by publishing impactful software and AI solutions.

Talk Track: Applied Case Studies

Talk Technical Level: 5/7

Talk Abstract:
How to successfully build and productionize a multi-agent architecture with Semantic Kernel and AutoGen.

What You’ll Learn
The audience will learn how to build a multi-agent architecture following best practices using open-source technology like Semantic Kernel and AutoGen. This session will accelerate the journey from single-agent to multi-agent systems and show how to productionize these systems at scale using best practices for LLMs in production.

Talk: Code Smarter, not harder: Generative AI in the Software Development Lifecycle

Presenter:
Keri Olson, VP Product Management, IBM

About the Speaker:
Keri Olson is the Vice President of Product Management for IBM AI for Code and Head of Product for IBM watsonx Code Assistant. She has over 20 years of experience in enterprise software and has held roles in Product Management, Engineering, Operations, Transformation, and Corporate Consulting. Keri is passionate about software and product development, AI for Code, driving innovation, and building strong technical and business partnerships. She is based in Rochester Minnesota, and she enjoys volunteering in the community as well as mentoring technical and business professionals to help build the next generation of leaders.

Talk Track: Applied Case Studies

Talk Technical Level: 6/7

Talk Abstract:
In this session we will dive into the quickly evolving landscape of AI coding assistants and how they play a pivotal role in the software development lifecycle (SDLC). Join us to learn how IBM watsonx Code Assistant serves as your AI-powered coding companion through examples and a live demo.

What You’ll Learn
How to use a generative AI code assistant (IBM watsonx Code Assistant) to generate, query and document your code – and much more. Learn how to use the same tools to modernize legacy code (in Java) to create a modern code-base fit for today’s needs.

Talk: Generative AI Infrastructure at Lyft

Presenter:
Konstantin Gizdarski, ML Engineering, Lyft

About the Speaker:
Konstantin is an engineer at Lyft where he has worked on expanding the company’s capabilities in machine learning. Originally from Bulgaria, Konstantin grew up in the San Francisco Bay Area and attended Northeastern in Boston as an undergraduate.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
In this talk, we will present the Gen AI infrastructure stack at Lyft.

We will talk about the components that were already part of our ML platform, which we reused to support AI applications, such as:
– model training
– model serving

Next, we will talk about some of the novel AI-related components we built:
– AI vendor gateway
– custom clients
– LLM evaluation
– PII preserving infrastructure

Finally, we will share one or two use-cases that have been utilizing Gen AI at Lyft.

What You’ll Learn
You will learn how to evolve an ML Platform into an AI Platform.

Talk: GenAI ROI: From Pilot to Profit

Presenter:
Ilyas Iyoob, Faculty, University of Texas; Head of Research, Kyndryl; Venture Partner, Clutch VC

About the Speaker:
Dr. Ilyas Iyoob is faculty of Data Science and Artificial Intelligence in the Cockrell School of Engineering at the University of Texas. He pioneered the seamless interaction between machine learning and operations research in the fields of autonomous computing, health-tech, and fin-tech. Previously, Dr. Iyoob helped build a cloud computing AI startup and successfully sold it to IBM. He currently advises over a dozen venture funded companies and serves as the Global Head of Research at Kyndryl (IBM Spinoff). He has earned a number of patents and industry recognition for applied Artificial Intelligence and was awarded the prestigious World Mechanics prize by the University of London.

Talk Track: Case Study

Talk Technical Level: 1/7

Talk Abstract:
In this session, we will dive deep into the real-world ROI of Generative AI, moving beyond pilot projects and into scalable, value-driving solutions. With real-world examples from our enterprise implementations, we reveal the hidden costs, unexpected value, and key metrics that truly matter when measuring success. We will also explore practical steps to overcome “pilot paralysis” and strategies for balancing innovation with cost control.

What You’ll Learn
Whether you’re a decision-maker or AI leader, this session will provide actionable insights on how to make GenAI work for your business, ensuring it delivers measurable impact and not just hype.

Talk: Scaling Vector Database Usage Without Breaking the Bank: Quantization and Adaptive Retrieval

Presenter:
Zain Hasan, Senior ML Developer Advocate, Weaviate

About the Speaker:
Zain Hasan is a Senior Developer Advocate at Weaviate, an open-source vector database. He is an engineer and data scientist by training, who pursued his undergraduate and graduate work at the University of Toronto building artificially intelligent assistive technologies. He then founded his company developing a digital health platform that leveraged machine learning to remotely monitor chronically ill patients. More recently he practiced as a consultant senior data scientist in Toronto. He is passionate about open-source software, education, community, and machine learning, and has delivered workshops and talks at multiple events and conferences.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
Everybody loves vector search, and enterprises now see its value thanks to the popularity of LLMs and RAG. The problem is that production-level deployment of vector search requires boatloads of compute: CPU for search and GPU for inference. The bottom line is that, if deployed incorrectly, vector search can be prohibitively expensive compared to classical alternatives.

The solution: quantizing vectors and performing adaptive retrieval. These techniques let you scale applications into production by letting you reliably balance and tune memory costs, latency, and retrieval accuracy.

I’ll talk about how you can perform real-time billion-scale vector search on your laptop! This includes covering different quantization techniques, including product, binary, scalar, and matryoshka quantization, which compress vectors by trading off memory requirements for accuracy. I’ll also introduce the concept of adaptive retrieval, where you first perform a cheap, hardware-optimized, low-accuracy search over compressed vectors to identify retrieval candidates, followed by a slower, higher-accuracy search to rescore and correct.

These quantization techniques, when used with well-thought-out adaptive retrieval, can lead to a 32x reduction in memory cost requirements at the cost of roughly 5% loss in retrieval recall in your RAG stack.
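The two ideas combine naturally. Below is a toy, pure-Python sketch of binary quantization followed by a two-pass adaptive search (an illustration under assumed names, not Weaviate’s implementation; real systems precompute the binary codes and pack them into machine words):

```python
def binarize(vec):
    # Binary quantization: keep only the sign of each dimension,
    # 1 bit per dimension vs. 32 for float32 (the source of the ~32x saving).
    return [1 if x > 0 else 0 for x in vec]

def hamming(a, b):
    # Cheap distance between binary codes.
    return sum(x != y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adaptive_search(query, vectors, shortlist=10, top_k=3):
    # Pass 1: low-accuracy search over the compressed (binary) vectors.
    bq = binarize(query)
    candidates = sorted(range(len(vectors)),
                        key=lambda i: hamming(bq, binarize(vectors[i])))[:shortlist]
    # Pass 2: rescore only the shortlist with the full-precision vectors.
    return sorted(candidates, key=lambda i: -dot(query, vectors[i]))[:top_k]

docs = [[1.0, 1.0], [-1.0, -1.0], [0.5, 0.5], [2.0, -3.0]]
print(adaptive_search([1.0, 1.0], docs, shortlist=3, top_k=1))  # [0]
```

The shortlist size is the knob: a larger shortlist recovers more of the recall lost to compression, at the cost of more full-precision work in the second pass.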

What You’ll Learn
TBA

Talk: Multimodal LLMs for Product Taxonomy at Shopify

Presenter:
Kshetrajna Raghavan, Senior Staff ML Engineer, Shopify

About the Speaker:
With over 12 years of industry experience spanning healthcare, ad tech, and retail, Kshetrajna Raghavan has spent the last four years at Shopify building cutting-edge machine learning products that make life easier for merchants. From Product Taxonomy Classification to Image Search and Financial Forecasting, Kshetrajna has tackled a variety of impactful projects. Their favorite? The Product Taxonomy Classification model, a game-changer for Shopify’s data infrastructure and merchant tools.

Armed with a Master’s in Operations Research from Florida Institute of Technology, Kshetrajna brings a robust technical background to the table.

When not diving into data, Kshetrajna loves jamming on guitars, tinkering with electric guitar upgrades, hanging out with two large dogs, and conquering video game worlds.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
At Shopify, we fine-tune and deploy large vision-language models in production to make millions of predictions a day, leveraging different open-source tooling to achieve this.
In this talk, we walk through how we did this for a generative AI use case at Shopify’s scale.

What You’ll Learn
a. Getting to Know Vision Language Models:

The Basics: We’ll kick things off with a quick rundown of what vision language models are and how they work.
Cool Uses: Dive into some awesome ways these models are being used in e-commerce, especially at Shopify.
b. Fine-Tuning and Deployment:

Tweaking the Models: Learn the ins and outs of fine-tuning these big models for specific tasks.
Going Live: Tips and tricks for deploying these models so they can handle millions of predictions every day without breaking a sweat.
c. Open Source Tools:

Tool Talk: How to pick the right open-source tools for different stages of your model journey.
Smooth Integration: Real-life examples of how we fit these tools into our workflows at Shopify.
d. Scaling Up and Speeding Up:

Scaling Challenges: The hurdles we faced when scaling these models and how we jumped over them.
Speed Boosts: Techniques to keep things running fast and smooth in a production setting.
e. Generative AI Case Study:

Deep Dive: A step-by-step look at a specific generative AI project we tackled at Shopify, from start to finish.
Key Takeaways: What we learned along the way and how you can apply these lessons to your own projects.

Talk: Large Language Model Training and Serving at LinkedIn

Presenter:
Dre Olgiati, Distinguished Engineer, AI/ML, LinkedIn

About the Speaker:
Dre is a Distinguished Engineer at LinkedIn, where he leads wide-ranging initiatives relevant to large model training, serving, MLOps and more.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
In this talk, Dre will describe some of the fundamental challenges and solutions faced by the LinkedIn team as they build innovative products based on LLMs and agents.

What You’ll Learn
How do I build scalable training and serving solutions for large language models (LLMs)? What are the challenges in scaling LLM training and serving?

Talk: Evolving with AI: Insights from Nylas’ Generative AI Journey

Presenter:
Nadia Rauch, Senior Engineering Manager Intelligence, Nylas

About the Speaker:
Nadia is an experienced leader managing Nylas’ Machine Learning and Data Engineering teams, where she joined in 2020. With 14+ years of experience building products and working with data in multiple roles, she brings a wealth of expertise to her current position. Nadia holds a Bachelor’s and Master’s degree in Computer Engineering from the University of Florence and specializes in Knowledge Bases and Ontologies. During her doctoral studies, she developed a pioneering Smart City system, an innovative project which showcased her ability to harness data from diverse sources to create actionable insights for urban planning and management. Additionally, Nadia is a dedicated advocate for diversity in tech, serving as a committee member for TMLS – Women x AI, where she empowers young women entering the field.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
In today’s rapidly evolving technological landscape, the integration of cutting-edge technologies such as Generative Artificial Intelligence (AI) presents both unprecedented opportunities and challenges for businesses. This presentation delves into the journey of Nylas in crafting and executing a strategic roadmap for the adoption of Generative AI within our organization.

Drawing upon real-world experiences and insights gleaned from our implementation process, we offer a firsthand account of the strategies, methodologies, and best practices that enabled us to seamlessly integrate Generative AI into our workflows while concurrently fulfilling existing customer commitments (in just 3 months!).

What You’ll Learn:
TBD

Talk: Toyota's Generative AI Journey

Presenter:
Ravi Chandu Ummadisetti, Generative AI Architect, Toyota

About the Speaker:
Ravi Chandu Bio (Generative AI Architect): Ravi Chandu Ummadisetti is a distinguished Generative AI Architect with over a decade of experience, known for his pivotal role in advancing AI initiatives at Toyota Motor North America. His expertise in AI/ML methodologies has driven significant improvements across Toyota’s operations, including a 75% reduction in production downtime and the development of secure, AI-powered applications. Ravi’s work at Toyota, spanning manufacturing optimization, legal automation, and corporate AI solutions, showcases his ability to deliver impactful, data-driven strategies that enhance efficiency and drive innovation. His technical proficiency and leadership have earned him recognition as a key contributor to Toyota’s AI success.

Kordel France Bio (AI Architect): Kordel brings a diverse background of experiences in robotics and AI from both academia and industry. He has multiple patents in advanced sensor design and spent much of the past few years founding and building a successful sensor startup that enables the sense of smell for robotics. He is on the board of multiple startups and continues to further his AI knowledge as an AI Architect at Toyota.

Eric Swei Bio (Senior Generative AI Architect): Boasting an impressive career spanning over two decades, Eric Swei is an accomplished polymath in the tech arena, with deep-seated expertise as a full stack developer, system architect, integration architect, and specialist in computer vision, alongside his profound knowledge in generative AI, data science, IoT, and cognitive technologies.

At the forefront as the Generative AI Architect at Toyota, Eric leads a formidable team in harnessing the power of generative AI. Their innovative endeavors are not only enhancing Toyota’s technological prowess but also redefining the future of automotive solutions with cutting-edge AI integration.

Stephen Ellis Bio (Technical Generative AI Product Manager): Stephen has 10 years of experience in research strategy and the application of emerging technologies at companies ranging from startups to Fortune 50 enterprises. He is a former Director of the North Texas Blockchain Alliance, where he led the cultivation of blockchain and cryptocurrency competencies among software developers, C-level executives, and private investment advisors. He was previously CTO of Plymouth Artificial Intelligence, which researched and developed future applications of AI; in that capacity he advised companies on building platforms that leverage emerging technologies for new business cases. He is currently a Technical Product Manager at Toyota Motor North America, focused on enabling generative AI solutions for groups across the enterprise to drive transformation in new mobility solutions and enterprise operations.

Talk Track: Case Study

Talk Technical Level: 2/7

Talk Abstract:
Team Toyota will delve into their innovative journey with generative AI in automotive design, exploring how Toyota research integrates traditional engineering constraints with state-of-the-art generative AI techniques, enhancing designers’ capabilities while ensuring safety and performance considerations.

What You’ll Learn
1. Toyota’s Innovation Legacy
2. Leveraging LLMs in Automotive – battery, vehicle, manufacturing, etc.
3. Failures in Generative AI projects
4. Education to business stakeholders

Talk: Creating Our Own Private OpenAI API

Presenters:
Meryem Arik, Co-Founder & CEO, TitanML | Hannes Hapke, Principal Machine Learning Engineer, Digits

About the Speaker:
Meryem is the Co-founder and CEO of TitanML. She is a prominent advocate of Women in AI and a TEDx speaker.

Hannes Hapke is a principal machine learning engineer at Digits, where he develops innovative ways to use machine learning to boost productivity for business owners and accountants. Prior to joining Digits, Hannes solved machine learning infrastructure problems in various industries including healthcare, retail, recruiting, and renewable energies.

Hannes actively contributes to TensorFlow’s TFX Addons project, has co-authored machine learning publications including the O’Reilly books “Building Machine Learning Pipelines” and “Machine Learning Production Systems,” and has presented state-of-the-art ML work at conferences such as ODSC 2022 and O’Reilly’s TensorFlow World.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
Recent advancements in open-source large language models (LLMs) have positioned them as viable alternatives to proprietary models. However, the journey to deploying these open-source LLMs is fraught with challenges, particularly around infrastructure requirements and optimization strategies.

In this talk, Meryem Arik and Hannes Hapke will provide a detailed roadmap for startups and corporations aiming to deploy open-source LLMs effectively. Leveraging real-world examples, they will illustrate the practical steps and considerations essential for successful implementation. Additionally, they will share invaluable lessons learned from their own deployment experiences, offering attendees actionable insights to navigate the complexities of open-source LLM deployment.

What You’ll Learn:
Real-world examples and lessons learned from 18+ months of deploying LLMs, mixed with deep technical insights from Meryem and the team

Talk: Revolutionizing Venture Capital: Leveraging Generative AI for Enhanced Decision-Making and Strategic

Presenter:
Yuvaraj Tankala, AI Engineer and Venture Capital Innovator, Share Ventures

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: A Data Scientist’s Guide to Unit & End-to-End Testing

Presenter:
Vatsal Patel, Senior Data Scientist, MongoDB

About the Speaker:
Vatsal Patel, Senior Data Scientist, MongoDB

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
A comprehensive guide designed to equip data scientists with essential knowledge and practical skills for testing their developed and deployed models.

Key Topics Covered:
– Why Testing is Crucial in ML: Understand the importance of testing in the machine learning lifecycle and how it ensures model reliability and performance.

– Test-Driven Development: Learn about the TDD methodology, its benefits, and how it encourages writing clean, maintainable code by defining tests before implementing the functionality.

– Tools for Testing ML Models: Explore tools for unit testing, end-to-end testing, and CI/CD integration, such as `unittest`, `pytest`, `drone`, and `GitHub Actions`.

– Unit Testing:
– Basic understanding of what unit testing is and its importance in verifying individual components of the ML pipeline.
– Best Practices: Identify best practices on how to approach testing, write test cases, and implement tests using popular frameworks.
– Tutorial: Practical examples illustrating how to write and run unit tests for data preprocessing functions and modelling.
– Running Tests: Instructions on running unit tests locally using `pytest`.

– End-to-End (E2E) Testing:
– Basic understanding of end-to-end testing, which validates the entire ML workflow from data ingestion to model serving.
– Dependency Injection: Understand dependency injection and how it helps isolate components and create flexible test configurations.
– Best Practices: Best practices on defining workflows, writing test cases for critical paths, and implementing tests using E2E frameworks.
– Tutorial: Example of writing end-to-end tests for a scoring/training pipeline.
– Running Tests: Guidance on executing E2E tests locally using `pytest`.

– Integration into CI/CD Pipelines: Learn how to automate model testing and deployment by integrating unit and end-to-end tests into CI/CD pipelines, ensuring continuous validation of code changes. This part will leverage a Makefile and go over code coverage.

This presentation is ideal for data scientists, machine learning engineers, and anyone developing and deploying ML models who want to enhance their testing practices. By the end of the session, attendees will have a solid understanding of unit and end-to-end testing principles, practical examples to follow, and the confidence to implement these testing strategies in their projects.
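
To give a flavor of the unit-testing tutorial, here is a minimal sketch in the spirit of the workshop (the function and test names are illustrative, not taken from the session materials); `pytest` discovers and runs any function whose name starts with `test_`:

```python
# test_preprocessing.py -- run with: pytest test_preprocessing.py
def normalize(values):
    """Scale a list of numbers to the [0, 1] range (a typical preprocessing step)."""
    lo, hi = min(values), max(values)
    if lo == hi:  # guard against division by zero for constant inputs
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_bounds():
    # The output should always span exactly [0, 1] for non-constant input.
    result = normalize([2, 4, 6])
    assert min(result) == 0.0 and max(result) == 1.0

def test_normalize_constant_input():
    # The edge case a test suite catches before production does.
    assert normalize([5, 5, 5]) == [0.0, 0.0, 0.0]
```

Running `pytest test_preprocessing.py` executes both tests and reports any failure with the introspected assert values.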

What You’ll Learn:
TBA

Talk: Building Agentic and Multi-Agent Systems with LangGraph

Presenters:
Greg Loughnane, Co-Founder, AI Makerspace | Chris Alexiuk, Co-Founder & CTO, AI Makerspace

About the Speaker:
Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Since 2021 he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.

Chris Alexiuk is the Co-Founder & CTO at AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Previously, he was a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
2024 is the year of agents, agentic RAG, and multi-agent systems!

This year, people and companies aim to build more complex LLM applications and models; namely, ones that are ever more capable of leveraging context and reasoning. For applications to leverage context well, they must provide useful input to the context window (e.g., in-context learning), through direct prompting or through search and retrieval (e.g., Retrieval Augmented Generation, or RAG). To leverage reasoning is to leverage the Reasoning-Action (ReAct) pattern, and to be “agentic” or “agent-like.” Another way to think about agents is that they enhance search and retrieval through the intelligent use of tools or services.

The industry’s best-practice tool for building complex LLM applications is LangChain. To build agents as part of the LangChain framework, we leverage LangGraph, which allows us to bake cyclical reasoning loops into our application logic. LangChain v0.2, the latest version of this leading orchestration tooling, incorporates LangGraph directly as the engine that powers stateful (and even fully autonomous) agent cycles.

In this session, we’ll break down all the concepts and code you need to understand and build the industry-standard agentic and multi-agent systems, from soup to nuts.

What You’ll Learn
– A review of the basic prototyping patterns of GenAI, including Prompt Engineering, RAG, Fine-Tuning, and Agents
– The core ideas and constructs to build agentic and multi-agent applications with LangGraph
– ⛓️ Build custom agent applications with LangGraph
– 🤖 Develop multi-agent workflows with LangGraph
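
As a taste of what the session covers, the cyclical “reason, then act” loop that LangGraph expresses as a graph can be sketched framework-free. Everything below (the `toy_llm`, the tool registry, the state dict) is an illustrative stand-in, not LangGraph’s actual API:

```python
# Framework-agnostic sketch of a cyclical reasoning loop: the agent
# alternates between a "reason" step (pick a tool or finish) and an
# "act" step (call the chosen tool) until it decides to stop.
def run_agent(question, llm, tools, max_steps=5):
    state = {"question": question, "observations": []}
    for _ in range(max_steps):
        decision = llm(state)               # "reason" node: pick a tool or finish
        if decision["action"] == "finish":
            return decision["answer"]
        tool = tools[decision["action"]]    # "act" node: call the chosen tool
        state["observations"].append(tool(decision["input"]))
    return "Gave up after max_steps"        # cycle guard, like a graph's recursion limit

# Toy "LLM" that searches once, then finishes with what it observed.
def toy_llm(state):
    if state["observations"]:
        return {"action": "finish", "answer": state["observations"][-1]}
    return {"action": "search", "input": state["question"]}

answer = run_agent("capital of France?", toy_llm, {"search": lambda q: "Paris"})
# answer == "Paris"
```

In LangGraph the same cycle is expressed as graph nodes with conditional edges and a recursion limit, rather than an explicit `for` loop.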

Talk: MLOps Template for Time Series in Production

Presenter:
Eddie Mattia, Data Scientist, Outerbounds

About the Speaker:
Data scientist at Outerbounds

Building AI developer tools and many applications on top of them!

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
In this session, we will build a complete MLOps platform that periodically retrains models and computes predictions in a batch inference pipeline. We’ll show how to build a time series forecasting machine with these properties and how the entire system can be deployed in the cloud.

What You’ll Learn:
How to frame time series forecasting problems for XGBoost.
How to build an end-to-end MLOps system.
How to trigger workflows in the cloud based on exogenous system events.
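
The first learning goal, framing forecasting for XGBoost, usually amounts to turning the series into rows of lagged features so a tabular model can predict the next step. A minimal, library-free sketch (the function name is illustrative):

```python
def make_lag_features(series, n_lags):
    """Turn a time series into (X, y) rows of trailing lags, so a tabular
    model like XGBoost can learn one-step-ahead forecasts."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # the n_lags values before time t
        y.append(series[t])             # the value to predict at time t
    return X, y

X, y = make_lag_features([10, 12, 13, 15, 14], n_lags=2)
# X == [[10, 12], [12, 13], [13, 15]], y == [13, 15, 14]
```

Each row pairs a target with its preceding observations; calendar features and rolling statistics are added the same way in practice.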

Talk: Hands-on Scalable Edge-to-Core ML Pipelines

Presenters:
Debadyuti Roy Chowdhury, VP Products, InfinyOn | Sehyo Chang, CTO, InfinyOn

About the Speaker:
Deb leads product management at InfinyOn, a distributed streaming infrastructure company. Deb’s career since 2006 spans IT, server administration, software and data engineering, leading data science and AI practices, and product management in HealthTech, Public Safety, Manufacturing, and Ecommerce.

Sehyo Chang is the CTO and Co-founder of InfinyOn. He is also the creator of the Fluvio open-source project. He dabbled in WASM technology at an early stage and spearheaded InfinyOn to join Bytecode Alliance. He is a veteran of the open-source business model. Previously he was at NGINX, where he developed nginmesh and Rust binding for NGINX.

Talk Track: Workshop

Talk Technical Level: 7/7

Talk Abstract:
In this intensive workshop, participants will gain hands-on experience in designing, implementing, and troubleshooting a real-world distributed ML pipeline that spans from edge devices to core infrastructure. We’ll tackle key MLOps challenges in building and managing complex, scalable systems for both operational analytics and AI/ML workflows.
Key topics covered:

Edge Computing: Simulating data ingestion from edge devices
Streaming Architecture: Implementing real-time data flows with open-source tools
Distributed Processing: Scaling ML workloads across heterogeneous environments
Model Deployment: Strategies for serving models at the edge and in the cloud
Observability and Monitoring: Setting up comprehensive monitoring for distributed ML systems
MLOps Best Practices: Applying DevOps principles to ML lifecycle management

Hands-on activities:

Participants will work in small groups to build a complete edge-to-core ML pipeline
Each team will deploy a pre-trained model for real-time inference at the edge
Groups will implement data validation and model monitoring across the pipeline
Participants will troubleshoot common issues in distributed ML systems

What You’ll Learn
Attendees will gain hands-on experience in designing, implementing, and troubleshooting a real-world distributed dataflow spanning operational analytics and AI/ML pipelines.

– Practical experience in designing scalable, distributed ML architectures
– Understanding of MLOps challenges in edge-to-core systems
– Hands-on skills in deploying and monitoring ML models across diverse environments
– Strategies for optimizing performance and resource usage in complex ML pipelines
– Best practices for maintaining data quality and model accuracy in production systems

Talk: Evaluating LLM-Judge Evaluations: Best Practices

Presenter:
Aishwarya Naresh Reganti, Applied Scientist, Amazon

About the Speaker:
Aishwarya is an Applied Scientist in the Amazon Search Science and AI org. She works on developing large-scale graph-based ML techniques that improve Amazon Search quality, trust, and recommendations. She obtained her Master’s degree in Computer Science (MCDS) from Carnegie Mellon’s Language Technologies Institute in Pittsburgh. Aishwarya has over six years of hands-on machine learning experience and 20+ publications in top-tier conferences such as AAAI, ACL, CVPR, NeurIPS, and EACL. She has worked on a wide spectrum of problems involving large-scale graph neural networks, machine translation, multimodal summarization, social media and social networks, human-centric ML, artificial social intelligence, and code-mixing, and has mentored several Master’s and PhD students in these areas. Aishwarya serves as a reviewer for various NLP and graph ML conferences, including ACL, EMNLP, AAAI, and LoG. She has had the opportunity to work with some of the best minds in both academia and industry through collaborations and internships at Microsoft Research, the University of Michigan, NTU Singapore, IIIT-Delhi, NTNU Norway, the University of South Carolina, and others.

Talk Track: In-Person Workshop

Talk Technical Level: 5/7

Talk Abstract:
The use of LLM-based judges has become common for evaluating scenarios where labeled data is not available or where a straightforward test set evaluation isn’t feasible. However, this approach brings the challenge of ensuring that your LLM judge is properly calibrated and aligns with your evaluation goals. In this talk, I will discuss some best practices to prevent what I call the “AI Collusion Problem,” where multiple AI entities collaborate to produce seemingly good metrics but end up reinforcing each other’s biases or errors. This creates a ripple effect.

What You’ll Learn
– Gain insight into what LLM judges are and the components that make them effective tools for evaluating complex use cases.
– Understand the AI Collusion problem in context of evaluation and how it can create a ripple effect of errors.
– Explore additional components and calibration techniques that help maintain the integrity and accuracy of evaluations.
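
One simple calibration practice in the spirit of the talk: before trusting an LLM judge at scale, measure its agreement against a small set of human labels. A minimal sketch (the function is illustrative, not from the talk):

```python
def judge_agreement(judge_labels, human_labels):
    """Fraction of items where the LLM judge's verdict matches a human
    label -- a minimal calibration check before trusting the judge at scale."""
    assert len(judge_labels) == len(human_labels)
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

# 2 of 3 judge verdicts match the human raters
score = judge_agreement(["pass", "fail", "pass"], ["pass", "fail", "fail"])
```

Low agreement is an early warning sign of the collusion effect the talk describes: metrics that look good only because AI components are grading each other.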

Talk: Building a Multimodal RAG: A Step-by-Step Guide for AI/ML Practitioners

Presenter:
Ivan Nardini, Developer Relation Engineer, AI/ML, Google Cloud

About the Speaker:
Ivan Nardini is a Developer Relations Engineer on Google’s Cloud team, focusing on Artificial Intelligence and Machine Learning. He enables developers to build innovative AI and ML applications using their preferred libraries, models, and tools on Vertex AI, through code samples, online content, and events. Ivan has a master’s degree in Economics and Social Sciences from Università Bocconi and attended specialized training in Data Science at the Barcelona Graduate School of Economics.

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
Learn to build a multimodal Retrieval Augmented Generation (RAG) system that goes beyond text, incorporating images, video, and audio. This hands-on session dives deep into the architecture and foundational principles of multimodal RAG, enabling you to leverage diverse data sources for enhanced information retrieval and extraction.

What You’ll Learn:
Foundational principles of multimodal RAG systems
How to design and implement key RAG components (data ingestion, parsing, chunking, retrieval, ranking, etc.) for multimodal RAG.
Best practices for evaluating RAG performance and mitigating hallucinations
Hands-on experience building a multimodal RAG system on Vertex AI (with provided cloud credits!)
Strategies for scaling your RAG system from prototype to MVP and beyond.
Gain insights into scaling Multimodal RAG for real-world applications.
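
As a text-only miniature of two of the components listed above (chunking and retrieval), the sketch below uses naive word overlap as a stand-in for embedding similarity; a real multimodal RAG system on Vertex AI would embed images, video, and audio as well. All names are illustrative:

```python
def chunk(text, size=40, overlap=10):
    """Split text into overlapping character chunks for indexing."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def retrieve(query, chunks, k=2):
    """Rank chunks by naive word overlap with the query (a stand-in for
    embedding similarity) and return the top k."""
    query_words = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(query_words & set(c.lower().split())))
    return scored[:k]

index = chunk("Multimodal RAG retrieves relevant text, image, and audio chunks.", size=30)
top = retrieve("relevant chunks", index, k=1)
```

The retrieved chunks would then be ranked, deduplicated, and packed into the generator’s context window, the remaining stages the session walks through.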

Talk: Building Reliable AI: A Workshop to Build a Production Ready Multi-Modal Conversational Agent

Presenter:
Stefan Krawczyk, Co-Founder & CEO, DAGWorks Inc.

About the Speaker:
Stefan hails from New Zealand, speaks Polish, and completed his Masters at Stanford specializing in AI. He has spent over 15 years working across many parts of the stack, but has focused primarily on data and machine learning / AI related systems and their connection to building product applications. He has built many 0 to 1 and 1 to 3 versions of these systems at places like Stanford, Honda Research, LinkedIn, Nextdoor, Idibon, and Stitch Fix.

A regular conference speaker, Stefan has guest lectured at Stanford’s Machine Learning Systems Design course & Apps with LLMs Inside Course and is an author of two popular open source frameworks called Hamilton and Burr.

Stefan is currently co-founder and CEO of DAGWorks, where he’s building for the composable AI future that spans pipelines & agents with Hamilton & Burr.

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
Demoware is easy, but production grade? That’s hard.
In this session we’ll walk through how to build and iterate toward a production-grade multi-modal conversational agent.

Building a reliable AI agent is work, and we’ll walk through the concepts you’ll need to get there quickly and robustly. In this workshop we’ll use “first principles” where it makes sense, rather than off-the-shelf giga-libraries, so you can more easily build a mental model of what you need in order to achieve a reliable result.

In this session we’ll build a fictional hotel concierge agent to help customers book rooms, update reservations, ask questions, and complain.

What You’ll Learn:
– How to structure, build, debug, and improve a multi-modal tool bot that uses tool calling to hit “business endpoints”
– An overview of MLOps/GenAIOps concepts required to build reliable AI by doing it
– When to use an off-the-shelf framework, and when to build it yourself
– How to use popular minimalistic frameworks like Burr, LanceDB, PyTest, etc (exact tool set confirmed before the workshop)
– A candidate software development lifecycle to do this well

Talk: Agentic Workflows in Cybersecurity

Presenter:
Dattaraj Rao, Chief Data Scientist, Persistent

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: Open-Ended and AI-Generating Algorithms in the Era of Foundation Models

Presenter:
Jeff Clune, Professor, Computer Science, University of British Columbia; CIFAR AI Chair, Vector; Senior Research Advisor, DeepMind

About the Speaker:
Jeff Clune is a Professor of computer science at the University of British Columbia, a Canada CIFAR AI Chair at the Vector Institute, and a Senior Research Advisor at DeepMind. Jeff focuses on deep learning, including deep reinforcement learning. Previously he was a research manager at OpenAI, a Senior Research Manager and founding member of Uber AI Labs (formed after Uber acquired a startup he helped lead), the Harris Associate Professor in Computer Science at the University of Wyoming, and a Research Scientist at Cornell University. He received degrees from Michigan State University (PhD, master’s) and the University of Michigan (bachelor’s). More on Jeff’s research can be found at JeffClune.com or on Twitter (@jeffclune). Since 2015, he won the Presidential Early Career Award for Scientists and Engineers from the White House, had two papers in Nature and one in PNAS, won an NSF CAREER award, received Outstanding Paper of the Decade and Distinguished Young Investigator awards, received two test of time awards, and had best paper awards, oral presentations, and invited talks at the top machine learning conferences (NeurIPS, CVPR, ICLR, and ICML). His research is regularly covered in the press, including the New York Times, NPR, the New Yorker, CNN, NBC, Wired, the BBC, the Economist, Science, Nature, National Geographic, the Atlantic, and the New Scientist.

Talk Track: Virtual Talk

Talk Technical Level: 3/7

Talk Abstract:
Foundation models (e.g. large language models) create exciting new opportunities in our longstanding quests to produce open-ended and AI-generating algorithms, wherein agents can truly keep innovating and learning forever. In this talk I will share some of our recent work harnessing the power of foundation models to make progress in these areas. I will cover our recent work on OMNI (Open-endedness via Models of human Notions of Interestingness), Video Pre-Training (VPT), Thought Cloning, Automatically Designing Agentic Systems, and The AI Scientist.

What You’ll Learn
TBA

Talk: Open-Ended and AI-Generating Algorithms in the Era of Foundation Models

Presenter:
Maxime Labonne, Senior Staff Machine Learning Scientist, Liquid AI

About the Speaker:
Maxime Labonne is a Senior Staff Machine Learning Scientist at Liquid AI, serving as the head of post-training. He holds a Ph.D. in Machine Learning from the Polytechnic Institute of Paris and is recognized as a Google Developer Expert in AI/ML.

An active blogger, he has made significant contributions to the open-source community, including the LLM Course on GitHub, tools such as LLM AutoEval, and several state-of-the-art models like NeuralBeagle and Phixtral. He is the author of the best-selling book “Hands-On Graph Neural Networks Using Python,” published by Packt.

Connect with him on X and LinkedIn.

Talk Track: Applied Case Studies

Talk Technical Level: 5/7

Talk Abstract:
Fine-tuning LLMs is a fundamental technique for companies to customize models for their specific needs. In this talk, we will introduce fine-tuning and best practices associated with it. We’ll explore how to create a high-quality data generation pipeline, discuss fine-tuning techniques using popular libraries, explain how model merging works, and present the best ways to evaluate LLMs.

What You’ll Learn
Best practices for fine-tuning, creating a high-quality data generation pipeline, fine-tuning techniques, best fine-tuning libraries, how to do model merging, and evaluation methods for fine-tuned models.
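
Of the topics listed, model merging is the easiest to demystify in code: the simplest variant, linear merging (weighted averaging of checkpoints, as in “model soups”), just averages each parameter across fine-tuned models. A toy sketch with plain dicts standing in for state dicts:

```python
def linear_merge(state_dicts, weights=None):
    """Merge checkpoints by weighted-averaging each parameter.
    Plain floats stand in for tensors to keep the sketch dependency-free."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n  # default: uniform average
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

model_a = {"layer.weight": 1.0, "layer.bias": 0.5}
model_b = {"layer.weight": 3.0, "layer.bias": 0.1}
merged = linear_merge([model_a, model_b])
# merged["layer.weight"] == 2.0
```

Production merging tools apply the same idea tensor by tensor, with more sophisticated variants (SLERP, task arithmetic) changing how the average is taken.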

Talk: LLMidas’ Touch: Safely Adopting GenAI for Production Use-Cases

Presenter:
Gon Rappaport, Solution Architect, Aporia

About the Speaker:
I’m a solution architect at Aporia, which I joined just over two years ago. I’ve spent over eight years in the tech industry, starting in low-level programming and cybersecurity before transitioning to AI & ML.

Talk Track: Virtual Workshop

Talk Technical Level: 3/7

Talk Abstract:
During the session, we’ll explore the challenges of adopting GenAI in production use-cases. Focusing on the goal of using language models to solve more dynamic problems, we’ll address the dangers of “No-man’s-prod” and provide insights into safe and successful adoption. This presentation is designed for engineers, product managers, and stakeholders, and aims to provide a roadmap for releasing your first GenAI applications safely and successfully to production.

What You’ll Learn:

– Become familiar with the potential issues of using generative AI in production applications
– Learn how to mitigate the dangers of AI applications
– Learn how to measure the performance of different AI application types

Talk: Hemm: Holistic Evaluation of Multi-modal Generative Models

Presenter:
Anish Shah, ML Engineer, Weights & Biases

About the Speaker:
TBA

Talk Track: Virtual Workshop

Talk Technical Level: 3/7

Talk Abstract:
Join Anish Shah for an in-depth session on fine-tuning and evaluating multimodal generative models. This talk will delve into advanced methodologies for optimizing text-to-image diffusion models, with a focus on enhancing image quality and improving prompt comprehension.
Learn how to leverage Weights & Biases for efficient experiment tracking, enabling seamless monitoring and analysis of your model’s performance.

Additionally, discover how to utilize Weave, a lightweight toolkit for tracking and evaluating LLM applications, to conduct practical and holistic evaluations of multimodal models.

The session will also introduce Hemm, a comprehensive library for benchmarking text-to-image diffusion models on image quality and prompt comprehension, integrated with Weights & Biases and Weave. By the end of this talk, you’ll be equipped with cutting-edge tools and techniques to elevate your multimodal generative models to the next level.

What You’ll Learn:
Advanced Fine-Tuning Techniques: Explore methods for fine-tuning text-to-image diffusion models to enhance image quality and prompt comprehension.
Optimizing Image Quality: Understand the metrics and practices for assessing and improving the visual fidelity of generated images.
Enhancing Prompt Comprehension: Learn how to ensure your models accurately interpret and respond to complex textual prompts.
Utilizing Weights & Biases: Gain hands-on experience with Weights & Biases for tracking experiments, visualizing results, and collaborating effectively.
Leveraging Weave: Discover how Weave can be used for lightweight tracking and evaluation of LLM applications, providing practical insights into model performance.
Introduction to Hemm: Get acquainted with Hemm and learn how it facilitates comprehensive benchmarking of text-to-image diffusion models.
Holistic Model Evaluation: Learn best practices for conducting thorough evaluations of multimodal models, ensuring they meet desired performance standards across various metrics.

Talk: Serving GenAI Workload At Scale With LitServe

Presenter:
Aniket Maurya, Research Engineer, Lightning AI

About the Speaker:
I’m Aniket, a machine learning and software engineer with over 4 years of experience and a strong track record of developing and deploying machine learning models to production.

Talk Track: Virtual Workshop

Talk Technical Level: 5/7

Talk Abstract:
Learn about serving AI models with high throughput at scale: dynamic batching, autoscaling, and serving complex LLM-based workloads.

What You’ll Learn:
– Model serving in production
– Dynamic batching for high throughput
– Autoscaling
– Logging and monitoring in production
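
Dynamic batching, the second bullet, can be illustrated without any serving framework: requests queue up and are flushed as one batch when the batch is full or the oldest request has waited too long. This is a simplified sketch of the idea, not LitServe’s implementation:

```python
class DynamicBatcher:
    """Collect incoming requests and flush them as one batch when the
    batch is full or the oldest request has waited past the deadline."""
    def __init__(self, max_batch_size=4, max_wait_s=0.05):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = []  # list of (arrival_time, request)

    def submit(self, request, now):
        """Add a request; return a batch if one is ready, else None."""
        self.queue.append((now, request))
        full = len(self.queue) >= self.max_batch_size
        stale = now - self.queue[0][0] >= self.max_wait_s
        if full or stale:
            batch = [req for _, req in self.queue]
            self.queue = []
            return batch  # hand the whole batch to the model at once
        return None

batcher = DynamicBatcher(max_batch_size=2, max_wait_s=1.0)
assert batcher.submit("a", now=0.0) is None          # waits for more requests
assert batcher.submit("b", now=0.1) == ["a", "b"]    # batch full -> flush
```

Batching amortizes the per-call overhead of GPU inference across requests, which is where most of the throughput gain comes from.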

Talk: From Black Box to Mission Critical: Implementing Advanced AI Explainability and Alignment in FSIs

Presenter:
Vinay Kumar Sankarapu, Founder & CEO, Arya.ai

About the Speaker:
Vinay Kumar Sankarapu is the Founder and CEO of Arya.ai. He did his Bachelor’s and Master’s in Mechanical Engineering at IIT Bombay with research in Deep Learning and published his thesis on CNNs in manufacturing. He started Arya.ai in 2013, one of the first deep learning startups, along with Deekshith, while finishing his Master’s at IIT Bombay.

He co-authored a patent for designing a new explainability technique for deep learning and implementing it in underwriting in FSIs. He also authored a paper on AI technical debt in FSIs, and wrote multiple guest articles on ‘Responsible AI’ and ‘AI usage risks in FSIs’. He has given multiple technical and industry presentations globally: Nvidia GTC (SF & Mumbai), ReWork (SF & London), Cypher (Bangalore), Nasscom (Bangalore), TEDx (Mumbai), etc. He was the youngest member of the ‘AI task force’ set up by the Indian Commerce Ministry in 2017 to provide inputs on policy and to support AI adoption as part of Industry 4.0. He was listed in Forbes Asia 30-Under-30 in the technology section.

Talk Track: Virtual Workshop

Talk Technical Level: 4/7

Talk Abstract:
In highly regulated industries like FSIs, there are stringent policies regarding the use of ML models in production. To gain acceptance from all stakeholders, multiple criteria beyond model performance must be met.

This workshop will discuss the challenges of deploying ML and the stakeholders’ requirements in FSIs. We will review the sample setup in use cases like claim fraud monitoring and health claim processing, along with the case study details of model performance and MLOps architecture iterations.

The workshop will also discuss the AryaXAI MLObservability competition specifications and launch details.

What You’ll Learn:
In this workshop, you will gain a comprehensive understanding of the expectations of FSIs while deploying machine learning models. We’ll explore the additional criteria beyond model performance essential for gaining acceptance from various stakeholders, including compliance officers, risk managers, and business leaders. We’ll delve into how AI explainability outputs must be iterated for multiple stakeholders and how alignment is implemented through real-world case studies in claim fraud monitoring and health claim processing. You’ll also gain insights into why the iterative process of developing MLOps architectures is needed to meet performance and compliance requirements.

Talk: Building AI Applications as a Developer

Presenters:
Roy Derks, Technical Product Manager, IBM watsonx.ai | Alex Seymour, Technical Product Manager, IBM watsonx.ai

About the Speaker:
Roy Derks is a lifelong software developer, author and public speaker from the Netherlands. His mission is to make the world a better place through technology by inspiring developers all over the world. Before jumping into Developer Advocacy and joining IBM, he founded and worked at multiple startups.

Talk Track: Virtual Workshop

Talk Technical Level: 5/7

Talk Abstract:
In today’s world, developers are essential for creating exciting AI applications. They build powerful applications and APIs that use Large Language Models (LLMs), relying on open-source frameworks or tools from LLM providers. In this session, you’ll learn how to build your own AI applications using the watsonx and watsonx.ai ecosystem, including use cases such as Retrieval-Augmented Generation (RAG) and Agents. Through live, hands-on demos, we’ll explore the watsonx.ai developer toolkit and the watsonx.ai Flows Engine. Join us to gain practical skills and unlock new possibilities in AI development!

What You’ll Learn:
By attending this session, you’ll acquire essential skills for effectively leveraging Large Language Models (LLMs) in your projects. You’ll learn to use LLMs via APIs and SDKs, integrate them with your own data, and understand Retrieval-Augmented Generation (RAG) concepts while building RAG systems using watsonx.ai. Additionally, this session will cover Agentic workflows, guiding you through their creation with watsonx.ai. Finally, you’ll explore how to work with various LLMs, including Granite, Llama, and Mistral, equipping you with the versatility needed to optimize AI applications in your development work.

Talk: RAG Hyperparameter Optimization: Translating a Traditional ML Design Pattern to RAG Applications

Presenter:
Niels Bantilan, Chief ML Engineer, Union.ai

About the Speaker:
Niels is the Chief Machine Learning Engineer at Union.ai, and core maintainer of Flyte, an open source workflow orchestration tool, author of UnionML, an MLOps framework for machine learning microservices, and creator of Pandera, a statistical typing and data testing tool for scientific data containers. His mission is to help data science and machine learning practitioners be more productive.

Talk Track: Research or Advanced Technical

Talk Technical Level: 4/7

Talk Abstract:
In the era of Foundation LLMs, a lot of energy has moved from the model training stage to the inference stage of the ML lifecycle, as we can see in the explosion of different RAG architectures. But has a lot changed in terms of the techniques to systematically improve performance of models at inference time? In this talk, we’ll recast hyperparameter optimization in terms of improving RAG pipelines against a “golden evaluation dataset” and see that not much has changed at a fundamental level: gridsearch, random search, and bayesian optimization still apply, and we can use these tried and true techniques for any type of inference architecture. All you need is a high quality dataset.

What You’ll Learn:
You’ll learn about hyperparameter optimization (HPO) techniques that are typically used in model training and apply them to the context of RAG applications. This session will highlight the conceptual and practical differences when implementing HPO in the AI inference setting and see how some of the traditional concepts in ML still apply, such as the bias-variance tradeoff.
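As a rough sketch of the idea, grid search translates directly to the RAG setting: enumerate configurations and score each against a golden dataset. The `evaluate` function below is a hypothetical stand-in for running retrieval plus generation and scoring the answers.

```python
import itertools

# Hypothetical stand-in: in a real pipeline, evaluate() would run
# retrieval + generation and score answers against the golden dataset.
def evaluate(chunk_size, top_k, golden_set):
    return len(golden_set) * top_k / chunk_size

# RAG hyperparameters play the role that learning rate or tree depth
# play in classic HPO.
search_space = {"chunk_size": [256, 512, 1024], "top_k": [3, 5, 10]}
golden_set = [("What is X?", "X is ..."), ("Define Y.", "Y is ...")]

best_score, best_cfg = float("-inf"), None
for values in itertools.product(*search_space.values()):
    cfg = dict(zip(search_space, values))
    score = evaluate(golden_set=golden_set, **cfg)
    if score > best_score:
        best_score, best_cfg = score, cfg

print(best_cfg)
```

Random search or Bayesian optimization would simply replace the exhaustive `itertools.product` loop with a different sampling strategy over the same search space.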

Talk: Multi-Graph Multi-Agent Systems – Determinism through Structured Representations

Presenter:
Tom Smoker, Technical Founder, WhyHow.AI

About the Speaker:
Co-Founder @ WhyHow.AI

Talk Track: Applied Case Studies

Talk Technical Level: 4/7

Talk Abstract:
As multi-agent systems are increasingly adopted, the range of unstructured information that agents must process in structured ways, both to return to a user and to pass back into the agent system, will grow. We explore what the increasing trend of multi-graph multi-agent systems for deterministic information representation and retrieval looks like.

What You’ll Learn:
Why structured knowledge representations are important, how structured knowledge representation requirements have changed and will change in an increasingly agentic-driven world with complex multi-agent systems.
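A toy sketch of why structured representations buy determinism (all names and data here are hypothetical): if each agent queries its own named graph of triples, the same query always yields the same answer, unlike free-form LLM generation.

```python
# Each agent reads/writes only its own named graph of
# (subject, predicate, object) triples.
graphs = {
    "legal_agent":   {("contract_42", "status", "signed")},
    "finance_agent": {("contract_42", "value", "10000 USD")},
}

def query(graph_name, subject, predicate):
    """Deterministic: the same query against the same graph always
    returns the same answers, in the same order."""
    return sorted(o for (s, p, o) in graphs[graph_name]
                  if s == subject and p == predicate)

print(query("finance_agent", "contract_42", "value"))
```

Scoping each agent to its own graph also bounds what information it can return to the user or to other agents.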

Talk: Fast Data Loading for Deep Learning Workloads with lakeFS Mount

Presenter:
Amit Kesarwani, Director, Solution Engineering, lakeFS

About the Speaker:
Amit heads the solution architecture group at Treeverse, the company behind lakeFS, an open-source platform that delivers a Git-like experience to object-storage based data lakes.
Amit has 30+ years of experience as a technologist working with Fortune 100 companies as well as start-ups, designing and implementing technical solutions for complicated business problems.
As an entrepreneur, he launched a cloud offering to provide Data Warehouse as a Service. Amit holds a Master’s certificate in Project Management from George Washington University and a bachelor’s degree in Computer Science and Technology from the Indian Institute of Technology (IIT), India. He is the inventor of the patent “System and Method for Managing and Controlling Data”.

Talk Track: Virtual Talk

Talk Technical Level: 6/7

Talk Abstract:
Working with large datasets locally allows for much more control over your executions and workflows, particularly for AI and deep learning workloads.

However, this can present a number of tradeoffs that lakeFS Mount helps solve:
• 𝗚𝗶𝘁 𝗶𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻 – Mounting a path in a Git repo automatically tracks the data version, linking it with your code. When checking older code versions, you get the corresponding data version, preventing local-only successes.

• 𝗦𝗽𝗲𝗲𝗱 – Data consistency and performance are guaranteed. lakeFS prefetches commit metadata into a local cache in sub-milliseconds, allowing you to work immediately without having to wait for large dataset downloads.

• 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝘁 – lakeFS Mount efficiently uses cache, accurately predicting which objects will be accessed. This enables granular pre-fetching for metadata and data files before processing starts.

• 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝘆 – Working locally risks using outdated or incorrect data versions. With Mount, you can work with consistent, immutable versions, ensuring you know exactly what data version you’re using.

What You’ll Learn
With lakeFS Mount, you can transparently mount an object store reference as a local directory (yes, even at petabyte-scale), while avoiding the common pitfalls typically associated with trying to access an object store as a filesystem.

In this talk, you will learn about lakeFS Mount and see a demonstration of:
• Training a TensorFlow predictive model on data mounted using lakeFS Mount
• Integration with Git to version code and data together
• Reproducibility of code as well as data

Talk: HybridRAG: Merging Knowledge Graphs with Vector Retrieval for Efficient Information Extraction

Presenter:
Bhaskarjit Sarmah, Vice President, BlackRock

About the Speaker:
As a Vice President and Data Scientist at BlackRock, I apply my machine learning skills and domain knowledge to build innovative solutions for the world’s largest asset manager. I have over 10 years of experience in data science, spanning multiple industries and domains such as retail, airlines, media, entertainment, and BFSI.

At BlackRock, I am responsible for developing and deploying machine learning algorithms to enhance the liquidity risk analytics framework, identify price-making opportunities in the securities lending market, and create an early warning system using network science to detect regime change in markets. I also leverage my expertise in natural language processing and computer vision to extract insights from unstructured data sources and generate actionable reports. My mission is to use data and technology to empower investors and drive better financial outcomes.

Talk Track: Virtual Talk

Talk Technical Level: 7/7

Talk Abstract:
In this session we will introduce HybridRAG, a novel approach that combines Knowledge Graphs (KGs) and Vector Retrieval Augmented Generation (VectorRAG) to improve information extraction from financial documents. HybridRAG addresses challenges in analyzing financial documents, such as domain-specific language and complex data formats, which traditional RAG methods often struggle with. By integrating Knowledge Graphs, HybridRAG provides a structured representation of financial data, thereby enhancing the accuracy and relevance of the generated answers. Experimental results demonstrate that HybridRAG outperforms both VectorRAG and GraphRAG individually in terms of retrieval accuracy and answer generation.

What You’ll Learn
Key learnings from this session will include an understanding of the integration of Knowledge Graphs (KGs) and Vector Retrieval Augmented Generation (VectorRAG) to enhance information extraction from financial documents. The paper addresses challenges posed by domain-specific language and complex data formats in financial documents, which are often not well-handled by general-purpose language models. The HybridRAG approach demonstrates improved retrieval accuracy and answer generation compared to using VectorRAG or GraphRAG alone, highlighting its effectiveness in generating contextually relevant answers. Although the focus is on financial documents, the techniques discussed have broader applications, offering insights into the wider utility of HybridRAG beyond the financial domain.
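The core retrieval step of such a hybrid approach might be sketched as follows (a minimal illustration with made-up data and scoring, not the paper’s implementation): context from a knowledge-graph lookup is merged with context from vector similarity search before generation.

```python
# Hypothetical data: structured facts in a KG, unstructured chunks with
# toy embedding vectors.
kg = {("AcmeCorp", "ceo"): "J. Doe", ("AcmeCorp", "revenue_2023"): "$1.2B"}
chunks = {"AcmeCorp reported strong growth in 2023.": [0.9, 0.1],
          "Weather was mild this quarter.": [0.1, 0.9]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def hybrid_context(entity, relation, query_vec, top_k=1):
    # Structured half: exact KG lookup for the entity/relation.
    kg_facts = [f"{entity} {relation}: {kg[(entity, relation)]}"] \
        if (entity, relation) in kg else []
    # Unstructured half: vector similarity over the chunk store.
    ranked = sorted(chunks, key=lambda c: cosine(chunks[c], query_vec),
                    reverse=True)
    # The generator would receive both sources of context.
    return kg_facts + ranked[:top_k]

ctx = hybrid_context("AcmeCorp", "ceo", [1.0, 0.0])
print(ctx)
```

The KG contributes precise, structured facts (useful for domain-specific terminology), while the vector side contributes broader narrative context; the generation step then conditions on both.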

Talk: Robustness with Sidecars: Weak-To-Strong Supervision For Making Generative AI Robust For Enterprise

Presenter:
Dan Adamson, Interim Chief Executive Officer & Co-Founder, AutoAlign AI

About the Speaker:
Dan Adamson is a co-founder of AutoAlign, a company focused on AI safety and performance. He has also co-founded PointChain (developing a neo-banking platform using AI for high-risk and underserved industries) and Armilla AI (a company helping enterprises manage AI risk with risk transfer solutions). He previously founded OutsideIQ, deploying AI-based AML and anti-fraud solutions to over 100 global financial institutions. He also previously served as the Chief Architect at Medstory, a vertical search start-up acquired by Microsoft. Adamson holds several search algorithm and AI patents in addition to numerous academic awards and holding an M.Sc. from U.C. Berkeley and B.Sc. from McGill. He also serves on the McGill Faculty of Science Advisory Board.

Talk Track: Business Strategy or Ethics

Talk Technical Level: 2/7

Talk Abstract:
Many enterprise pilots with GenAI are stalling because of a lack of consistent performance as well as compliance, safety and security concerns. Comprehensive GenAI safety must continually evolve to mitigate critical issues such as hallucinations, jailbreaks, data leakage, biased content, and more.

Learn how AutoAlign CEO and co-founder Dan Adamson leveraged over two decades of experience building regulated AI solutions to launch Sidecar, ensuring models are powerful AND safe. Learn how weak-to-strong controls put decisions directly in users’ hands, improving model power while ensuring Generative AI is safe to use.

What You’ll Learn:
During this session, participants will have the opportunity to learn about common approaches to protect GenAI against jailbreaks, bias, data leakage and hallucinations and other harms. We’ll discuss the unique requirements of bringing LLMs to production in real-world applications, the critical importance of ensuring a high level of robustness and safety, and tools for solving these problems.

We’ll then discuss a new approach: weak supervision with a sidecar that can not only increase safety but can also make models more powerful. Finally, we’ll show some of our latest benchmarks around accuracy and discuss these state-of-the-art results.

Talk: Revolutionizing the Skies: An MLOps Case Study of LATAM Airlines

Presenters:
Michael Haacke Concha, MLOps Lead, LATAM Airlines
Diego Castillo Warnken, Staff Machine Learning Engineer, LATAM Airlines

About the Speaker:
Michael Haacke Concha is the Lead Machine Learning Engineer of the centralized MLOps team at LATAM Airlines. He holds both a Bachelor’s and a Master’s degree in Theoretical Physics from Pontificia Universidad Católica de Chile (PUC). Over his three years at LATAM Airlines, he developed an archival and retrieval system for black box data of the aircraft to support analytics. He then played a key role in building the framework for integrating the Iguazio MLOps platform within the company. In the past year, he has been leading the development of a new platform using Vertex GCP.

Prior to joining LATAM Airlines, Michael worked as a data scientist on the ATLAS experiment at the Large Hadron Collider (LHC), where he contributed to various studies, including the search for a long-lived Dark Photon and a Heavy Higgs.

Diego Castillo is a Consultant Machine Learning Engineer at Neuralworks, currently on assignment as Staff in LATAM Airlines, where he plays a pivotal role within the decentralized Data & AI Operations team. A graduate of the University of Chile with a degree in Electrical Engineering, Diego has excelled in cross-functional roles, driving the seamless integration of machine learning models into large-scale production environments. As a Staff Machine Learning Engineer at LATAM, he not only leads and mentors other MLEs but also shapes the technical direction across key business areas.

Throughout his career at LATAM Airlines, Diego has significantly impacted diverse domains, including Cargo, Customer Care, and the App and Landing Page teams. More recently, he has been supporting the migration of the internal MLOps framework from Iguazio to Vertex GCP.

With a comprehensive expertise spanning the entire machine learning lifecycle, Diego brings a wealth of experience from previous roles, including Data Scientist, Backend Developer, and Data Engineer, making him a versatile leader in the AI space.

Talk Track: Applied Case Studies

Talk Technical Level: 2/7

Talk Abstract:
This talk explores how LATAM Airlines leveraged MLOps to revolutionize their operations and achieve financial gains in the hundreds of millions of dollars. By integrating machine learning models into their daily workflows and automating the deployment and management processes, LATAM Airlines was able to optimize tariffs, enhance customer experiences, and streamline maintenance operations. The talk will highlight key MLOps strategies employed, such as continuous integration and delivery of ML models and real-time data processing. Attendees will gain insights into the tangible benefits of MLOps, including cost savings, operational efficiencies, and revenue growth, showcasing how strategic ML operations can create substantial value in the airline industry.

What You’ll Learn
You will gain insight into how a scalable and decentralized tech team grows inside LATAM Airlines, thanks to technology and organizational structure. You will also learn about some of the successful use cases of our MLOps ecosystem.

Talk: LeRobot: Democratizing Robotics

Presenter:
Remi Cadene, ML for Robotics, Hugging Face

About the Speaker:
I build next-gen robots at Hugging Face. Before, I was a research scientist at Tesla on Autopilot and Optimus. Academically, I did some postdoctoral studies at Brown University and my PhD at Sorbonne.

My scientific interest lies in understanding the underlying mechanisms of intelligence. My research is focused on learning human behaviors with neural networks. I am working on novel architectures, learning approaches, theoretical frameworks and explainability methods. I like to contribute to open-source projects and to read about neuroscience!

Talk Track: Virtual Talk

Talk Technical Level: 3/7

Talk Abstract:
Learn about how LeRobot aims to lower the barrier of entry to robotics, and how you can get started!

What You’ll Learn
1. What LeRobot’s mission is.
2. Ways in which LeRobot aims to lower the barrier of entry to robotics.
3. How you can get started with your own robot.
4. How you can get involved in LeRobot’s development.

Talk: From ML Repository to ML Production Pipeline

Presenters:
Jakub Witkowski, IT Expert, Roche Informatics | Dariusz Adamczyk, IT Expert, Roche Informatics

About the Speaker:
Jakub Witkowski, PhD is a data scientist and MLOps engineer with experience spanning various industries, including consulting, media, and pharmaceuticals. At Roche, he focuses on understanding the needs of data scientists to help them make their work and models production-ready. He achieves this by providing comprehensive frameworks and upskilling opportunities.

Dariusz is a DevOps and MLOps engineer. He has experience in various industries such as public cloud computing, telecommunications, and pharmaceuticals. At Roche, he focuses on infrastructure and the process of deploying machine learning models into production.

Talk Track: Virtual Talk

Talk Technical Level: 4/7

Talk Abstract:
In the pRED MLOps team, we collaborate closely with research scientists to transition their machine learning models into a production environment seamlessly. Through our efforts, we have developed a robust framework that standardises and scales this process effectively. In this talk, we will provide an in-depth look at our framework, the tools we leverage, and the challenges we overcome in this journey.

What You’ll Learn
– How to create a framework for moving ML code to production
– What can be automated in this process (the role of containerisation, CI/CD, and building reusable components for repeated tasks)
– Which tools are important for the dev team
– The most important challenges to tackle in this process

Talk: Striking the Balance: Leveraging Human Intelligence with LLMs for Cost-Effective Annotations

Presenter:
Geoff LaPorte, Applied AI Solutions Architect, Appen

About the Speaker:
Geoff is a seasoned tech innovator with over 13 years of experience, transitioning from management consulting to software development. He specializes in bridging the gap between technology and business strategy, consistently delivering user-focused, high-impact solutions. Geoff is known for pushing boundaries and tackling complex technology challenges with a passion.

Talk Track: Applied Case Studies

Talk Technical Level: 7/7

Talk Abstract:
Data annotation involves assigning relevant information to raw data to enhance machine learning (ML) model performance. While this process is crucial, it can be time-consuming and expensive. The emergence of Large Language Models (LLMs) offers a unique opportunity to automate data annotation. However, the complexity of data annotation, stemming from unclear task instructions and subjective human judgment on equivocal data points, presents challenges that are not immediately apparent.

In this session, Geoff will provide an overview of an experiment that Appen recently conducted to test the tradeoff between the quality and cost of training ML models via LLMs vs. human input. The goal was to differentiate between utterances that could be confidently annotated by LLMs and those that required human intervention. This differentiation was crucial to ensure a diverse range of opinions and to prevent incorrect responses from overly general models. Geoff will walk audience members through the dataset and methodology used for the experiment, as well as the company’s research findings.

What You’ll Learn
Geoff will walk audience members through an experiment that highlights a key issue with using a vanilla LLM—it might struggle with complex real-world tasks. Researchers recommend exercising caution when relying solely on LLMs for annotation. Instead, a balanced approach combining human input with LLM capabilities is recommended, considering their complementary strengths in terms of annotation quality and cost-efficiency.

Talk: ML Deployment at Faire: Predicting the Future, Serving the Present

Presenter:
Harshit Agarwal, Senior Machine Learning Engineer, Faire Wholesale Inc

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
How Faire transitioned a traditional infrastructure into a modern, flexible model deployment and serving stack that supports a range of model types, while ensuring operational excellence and scalability in a dynamic e-commerce environment.

Over the past few years at Faire, we have overhauled our ML serving infrastructure, moving from hosting XGBoost models in a monolithic service to a flexible and powerful ML deployment and serving stack that powers all types of models, small and big.

In this talk, we’ll cover how we set up a system that makes it easy to migrate, deploy, scale, and manage different types of models. Key points will include how we set up infrastructure as code and CI/CD pipelines for smooth deployment, automated testing, and created user-friendly tools for managing model releases. We’ll also touch on how we built in observability and monitoring to keep an eye on model performance and reliability.

Come and learn how Faire’s ML serving stack helps our team quickly bring new ideas to life, while also maintaining the operational stability needed for a growing marketplace.

What You’ll Learn
1. How to best structure ML serving and deployment infrastructure
2. How to build testing and observability into your deployment and serving infra
3. How to build production grade tools that your data scientists and MLEs will love
4. See how we are serving users at scale and the design choices that we made

Talk: Memory Optimizations for Machine Learning

Presenter:
Tejas Chopra, Senior Software Engineer, Netflix

About the Speaker:
Tejas Chopra is a Senior Software Engineer, working in the Data Storage Platform team at Netflix, where he is responsible for architecting storage solutions to support Netflix Studios and Netflix Streaming Platform. Prior to Netflix, Tejas was working on designing and implementing the storage infrastructure at Box, Inc. to support a cloud content management platform that scales to petabytes of storage & millions of users. Tejas has worked on distributed file systems & backend architectures, both in on-premise and cloud environments as part of several startups in his career. Tejas is an International Keynote Speaker and periodically conducts seminars on Micro services, NFTs, Software Development & Cloud Computing and has a Masters Degree in Electrical & Computer Engineering from Carnegie Mellon University, with a specialization in Computer Systems.

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
As Machine Learning continues to forge its way into diverse industries and applications, optimizing computational resources, particularly memory, has become a critical aspect of effective model deployment. This session, “Memory Optimizations for Machine Learning,” aims to offer an exhaustive look into the specific memory requirements in Machine Learning tasks, including Large Language Models (LLMs), and the cutting-edge strategies to minimize memory consumption efficiently.

We’ll begin by demystifying the memory footprint of typical Machine Learning data structures and algorithms, elucidating the nuances of memory allocation and deallocation during model training phases. The talk will then focus on memory-saving techniques such as data quantization, model pruning, and efficient mini-batch selection. These techniques offer the advantage of conserving memory resources without significant degradation in model performance.

A special emphasis will be placed on the memory footprint of LLMs during inferencing. LLMs, known for their immense size and complexity, pose unique challenges in terms of memory consumption during deployment. We will explore the factors contributing to the memory footprint of LLMs, such as model architecture, input sequence length, and vocabulary size. Additionally, we will discuss practical strategies to optimize memory usage during LLM inferencing, including techniques like model distillation, dynamic memory allocation, and efficient caching mechanisms.

By the end of this session, attendees will have a comprehensive understanding of memory optimization techniques for Machine Learning, with a particular focus on the challenges and solutions related to LLM inferencing.

What You’ll Learn
By the end of this session, attendees will have a comprehensive understanding of memory optimization techniques for Machine Learning, including pruning, quantization, and distillation, and where to apply them. They will also learn how to implement these techniques using PyTorch.
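As a taste of one technique the session covers, here is a minimal, dependency-free sketch of int8 affine quantization (illustrative only; in practice you would use PyTorch’s quantization APIs): float32 weights map to int8 values plus a scale and zero point, cutting storage roughly 4x.

```python
def quantize(weights):
    """Map float weights to int8 plus (scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # guard against all-equal weights
    zero_point = round(-lo / scale) - 128   # shift range into [-128, 127]
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

w = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, err)  # reconstruction error stays within half a quantization step
```

Each int8 value occupies 1 byte versus 4 bytes for float32; the price is a small, bounded reconstruction error per weight.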

Talk: From Black Box to Glass Box: Interpreting your Model

Presenter:
Zachary Carrico, Senior Machine Learning Engineer, Apella

About the Speaker:
Zac is a Senior Machine Learning Engineer at Apella, specializing in machine learning products for improving surgical operations. He has a deep interest in healthcare applications of machine learning, and has worked on cancer and Alzheimer’s disease diagnostics. He has end-to-end experience developing ML systems: from early research to serving thousands of daily customers. Zac is an active member of the ML community, having presented at conferences such as Ray Summit, TWIMLCon, and Data Day. He has also published eight journal articles. He is passionate about advancing model interpretability and reducing model bias. In addition, he has extensive experience in improving MLOps to streamline the deployment and monitoring of models, reducing complexity and time. Outside of work, Zac enjoys spending time with his family in Austin and traveling the world in search of the best surfing spots.

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
Interpretability is crucial for improving model performance, reducing biases, and ensuring compliance with AI safety and fairness regulations. In this session, complex neural networks will be transformed from opaque “black boxes” into interpretable “glass boxes” by exploring a wide range of neural network-specific interpretability techniques. Attendees will learn about methods such as saliency maps, integrated gradients, Grad-CAM, SHAP, and activation maximization. The session will combine theoretical explanations with practical demonstrations, helping attendees effectively improve transparency and trust in neural network predictions.

What You’ll Learn
Attendees will learn how to apply various neural network interpretability techniques to understand model behavior better. They will gain insights into methods such as saliency maps, Grad-CAM for visualizing important regions in images, and integrated gradients for attributing feature importance. The session will also cover feature visualization methods to understand neuron activations and how to use layer-wise relevance propagation to track the impact of inputs through network layers. By the end of the session, participants will know how to use these tools to make neural networks more understandable and how to communicate the insights to diverse stakeholders.
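To make the gradient-based methods concrete, here is a minimal, library-free sketch of a saliency computation. A hypothetical scoring function stands in for a trained network, and finite differences stand in for autograd gradients (which is what PyTorch-based saliency tools actually use); the idea is the same: the magnitude of the score’s sensitivity to each input feature is that feature’s importance.

```python
def model(x):
    # Hypothetical scoring function standing in for a trained network.
    return 3.0 * x[0] + 0.5 * x[1] ** 2 - 1.0 * x[2]

def saliency(f, x, eps=1e-5):
    """Estimate |df/dx_i| for each feature via finite differences."""
    grads = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        grads.append((f(bumped) - f(x)) / eps)
    return [abs(g) for g in grads]  # magnitude = feature importance

s = saliency(model, [1.0, 2.0, 3.0])
print([round(v, 2) for v in s])  # feature 0 matters most at this input
```

For images, the same per-feature sensitivities arranged back into the pixel grid are exactly what a saliency map visualizes.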

Talk: Secure and Scalable MLOps Pipelines for Generative AI in Cloud-Native Architectures

Presenter:
Sumit Dahiya, Solution Architect (Vice President), Barclays Americas

About the Speaker:
Sumit Dahiya is an accomplished Solution Architect and Vice President at Barclays Americas, with over 18 years of experience in technology, cybersecurity, and digital transformation. He specializes in Identity and Access Management (IAM), cloud-native architectures, microservices, and the integration of AI/ML technologies into enterprise systems. Sumit has successfully led numerous large-scale projects across various industries, including finance, telecom, and retail.

Throughout his career, Sumit has been instrumental in designing scalable, secure infrastructures, and pioneering innovative solutions for some of the most complex technological challenges. His contributions include spearheading the deployment of Microsoft VIVA at Barclays, leading the Request/Breakglass System migration, and building cloud-based IAM frameworks that ensure enhanced security, compliance, and operational efficiency.

Sumit has been recognized with numerous awards, including the Global Recognition Award and Influencer of the Year by the Asian African Economic Forum. He is also a Top Voice on LinkedIn for System Architecture and an active mentor on platforms like ADP List and Startupbootcamp, where he helps guide future leaders in technology.

A published author and researcher, Sumit has contributed to peer-reviewed journals and conferences on topics such as cloud security, machine learning, and enterprise architecture. His recent research papers include works on AI-powered cloud solutions and Identity and Access Management integration.

In addition to his technical expertise, Sumit is a regular speaker at international conferences, sharing his insights on overcoming the challenges of Generative AI adoption, digital transformation, and scalable enterprise architectures. He has also served as a judge for prestigious awards like the Brandon Hall Excellence Awards and the Globee Leadership Awards.

Talk Track: Virtual Talk

Talk Technical Level: 5/7

Talk Abstract:
Generative AI has rapidly evolved, offering immense potential to revolutionize industries by enabling intelligent automation, creative content generation, and personalized experiences. However, deploying and managing Generative AI models at scale comes with unique challenges, particularly around security, scalability, and integration in cloud-native environments.

In this talk, we will explore how to build secure and scalable MLOps pipelines tailored for Generative AI models, addressing critical challenges in model deployment, version control, data governance, and lifecycle management. The session will focus on designing robust pipelines that can effectively handle the resource-intensive nature of Generative AI while ensuring security and compliance.

What You’ll Learn:
Architecting Scalable MLOps Pipelines for Generative AI:

How to design modular and microservices-based MLOps pipelines that scale to meet the computational demands of Generative AI models.
Techniques to optimize cloud-native infrastructure (AWS, Azure, Google Cloud) using containerization (Docker, Kubernetes) and orchestration tools for efficient resource management.
Securing the MLOps Pipeline for Generative AI:

Best practices for securing every stage of the Generative AI lifecycle, from data ingestion and model training to deployment and monitoring.
How to implement Identity and Access Management (IAM) to control and audit access to models, sensitive data, and outputs, ensuring the security of the AI workflow.
Implementing Compliance and Governance in AI Workflows:

Strategies to ensure compliance with industry regulations like GDPR, HIPAA, and SOX when deploying Generative AI models.
How to integrate governance frameworks into your MLOps pipeline to enhance transparency, bias mitigation, and ethical AI practices.
Building Continuous Integration/Continuous Deployment (CI/CD) for Generative AI:

Techniques to automate the model training and deployment process using CI/CD pipelines, ensuring that your AI systems are continuously updated and improved.
How to monitor models in production for real-time performance, detect model drift, and ensure ongoing security and compliance.
Real-World Case Studies and Practical Insights:

Learn from case studies of successful Generative AI deployments in cloud-native environments, showcasing best practices for secure, scalable, and reliable MLOps pipelines.
Practical insights into overcoming the common challenges faced when operationalizing Generative AI, including cost management, latency, and maintaining high availability.

Talk: LLMs in Vision Models

Presenter:
Arpita Vats, Senior AI Engineer

About the Speaker:
I am a Senior AI Engineer at LinkedIn with expertise in AI, Deep Learning, NLP, and Computer Vision. I have experience from Meta and Amazon, where I focused on LLM and Generative AI. I have published papers and led projects enhancing recommendation algorithms and multimedia models for various industry applications.

Talk Track: Virtual Talk

Talk Technical Level: 5/7

Talk Abstract:
The integration of Large Language Models (LLMs) in vision-based AI systems has sparked a new frontier in multimedia understanding. Traditional vision models, while powerful, often lack the ability to comprehend contextual information beyond visual features. By incorporating LLMs, vision models can process both visual and textual information, creating a more holistic and interpretable understanding of multimedia content. This presentation will explore the convergence of LLMs with vision models, highlighting their application in image captioning, object recognition, and multimodal recommendation systems.

What You’ll Learn:
By attending this presentation, the audience will learn how Large Language Models (LLMs) can enhance the capabilities of vision-based AI systems, creating more context-aware and interpretable multimedia models. Attendees will gain insights into the architecture and integration techniques used to combine vision and language models, practical industry applications, and the challenges and solutions associated with building these advanced systems. They will leave with a deeper understanding of how LLMs in vision models are transforming multimedia analysis, enabling more accurate, scalable, and personalized AI-driven solutions.

Panel: Achieving Long-Term AI Growth: Real Problem Solving vs. Trend-Based Solutions

Presenters:
Carly Taylor, Founder, Rebel Data Science | Joe Reis, Recovering Data Scientist, Best-Selling Author, Ternary Data, LLC | Jepson Taylor, Former Chief AI Strategist, VEOX Inc | Hugo Bowne-Anderson, Independent Data and AI Scientist, hey.com

About the Speaker:
Carly is a data scientist, computational chemist and machine learning engineer. She obtained her M.S. in chemistry from the University of Colorado, focusing on computational quantum dynamics. She has authored multiple peer-reviewed publications and holds two non-provisional machine learning patents. When she isn’t writing about herself in the third person, building mechanical keyboards or neglecting the Oxford comma, she works as a director of security strategy for Call of Duty at Activision Publishing.

Joe Reis is a “recovering data scientist” who’s worked in a variety of data roles (analytics, machine learning, engineering, architecture, etc.) since the early 2000s. Joe is the co-author of The Fundamentals of Data Engineering (O’Reilly, 2022).

Jepson is a popular speaker in the AI space, having been invited to give AI talks to companies like SpaceX, Red Bull, Goldman Sachs, Amazon, and various branches of the US government. Jepson’s applied career has covered semiconductor, quant finance, HR analytics, deep-learning startup, and AI platform companies. Jepson co-founded and sold his deep-learning company Zeff.ai to DataRobot in 2020 and later joined Dataiku as their Chief AI Strategist. Jepson is currently launching a new AI company focused on the next generation of AI called VEOX Inc.

Talk Track: Panel

Talk Technical Level: TBD

Talk Abstract:
TBD

What You’ll Learn:
TBD

Panel: The Current Investment Landscape: Opportunities & Challenges in ML/Gen AI

Presenters:
George Mathew, Managing Director, Insight Partners | Prerna Sharma, General Partner, Antler | Mark Weber, Fellow / Investor, MIT Media Lab/Tectonic Ventures

About the Speaker:
George Mathew is a Managing Director at Insight Partners, where he focuses on venture-stage investments in AI, ML, Analytics, and Data companies as they are establishing product/market fit.

He brings 20+ years of experience developing high-growth technology startups including most recently being CEO of Kespry. Prior to Kespry, George was President & COO of Alteryx where he scaled the company through its IPO (AYX). Previously he held senior leadership positions at SAP and salesforce.com. He has driven company strategy, led product management and development, and built sales and marketing teams.

George holds a Bachelor of Science in Neurobiology from Cornell University and a Masters in Business Administration from Duke University, where he was a Fuqua Scholar.

Prerna Sharma is a General Partner at Antler.

Talk Track: Business Strategy

Talk Technical Level: 1/7

Talk Abstract:
TBD

What You’ll Learn:
TBD

Panel: Toyota's Generative AI Journey

Presenters:
Ravi Chandu Ummadisetti, Generative AI Architect, Toyota | Stephen Ellis, Technical Generative AI Product Manager, Toyota | Kordel France, AI Architect, Toyota | Eric Swei, AI Architect, Toyota

About the Speaker:
Ravi Chandu Bio (Generative AI Architect): Ravi Chandu Ummadisetti is a distinguished Generative AI Architect with over a decade of experience, known for his pivotal role in advancing AI initiatives at Toyota Motor North America. His expertise in AI/ML methodologies has driven significant improvements across Toyota’s operations, including a 75% reduction in production downtime and the development of secure, AI-powered applications. Ravi’s work at Toyota, spanning manufacturing optimization, legal automation, and corporate AI solutions, showcases his ability to deliver impactful, data-driven strategies that enhance efficiency and drive innovation. His technical proficiency and leadership have earned him recognition as a key contributor to Toyota’s AI success.

Stephen Ellis Bio (Technical Generative AI Product Manager): 10 years of experience in research strategy and the application of emerging technologies for companies ranging from startups to Fortune 50 enterprises. Former Director of the North Texas Blockchain Alliance, where he led the cultivation of blockchain and cryptocurrency competencies among software developers, C-level executives, and private investment advisors. Formerly the CTO of Plymouth Artificial Intelligence, which researched and developed future applications of AI; in that capacity he advised companies on building platforms that leverage emerging technologies for new business cases. Currently a Technical Product Manager at Toyota Motor North America, focused on enabling generative AI solutions for various groups across the enterprise to drive transformation in developing new mobility solutions and enterprise operations.

Kordel France Bio (AI Architect): Kordel brings a diverse background of experiences in robotics and AI from both academia and industry. He has multiple patents in advanced sensor design and spent much of the past few years founding and building a successful sensor startup that enables the sense of smell for robotics. He is on the board of multiple startups and continues to further his AI knowledge as an AI Architect at Toyota.

Eric Swei Bio (Senior Generative AI Architect): Boasting an impressive career spanning over two decades, Eric Swei is an accomplished polymath in the tech arena, with deep-seated expertise as a full stack developer, system architect, integration architect, and specialist in computer vision, alongside his profound knowledge in generative AI, data science, IoT, and cognitive technologies.

At the forefront as the Generative AI Architect at Toyota, Eric leads a formidable team in harnessing the power of generative AI. Their innovative endeavors are not only enhancing Toyota’s technological prowess but also redefining the future of automotive solutions with cutting-edge AI integration.

Talk Track: Case Study

Talk Technical Level: 2/7

Talk Abstract:
Team Toyota will delve into their innovative journey with generative AI in automotive design. The talk explores how Toyota’s research integrates traditional engineering constraints with state-of-the-art generative AI techniques, enhancing designers’ capabilities while ensuring safety and performance considerations.

What You’ll Learn:
1. Toyota’s Innovation Legacy
2. Leveraging LLMs in Automotive – battery, vehicle, manufacturing, etc.
3. Failures in Generative AI projects
4. Education to business stakeholders

Talk: Building AI Infrastructure for GenAI Wave

Presenter:
Shreya Rajpal, CEO & Co-Founder, Guardrails AI

About the Speaker:
Shreya Rajpal is the CEO of Guardrails AI, an open source platform developed to ensure increased safety, reliability and robustness of large language models in real-world applications. Her expertise spans a decade in the field of machine learning and AI. Most recently, she was the founding engineer at Predibase, where she led the ML infrastructure team. In earlier roles, she was part of the cross-functional ML team within Apple’s Special Projects Group and developed computer vision models for autonomous driving perception systems at Drive.ai.

Talk Abstract:
As Generative AI (GenAI) continues to revolutionize industries, it brings a new set of risks and challenges. This talk focuses on building robust AI infrastructure to manage and mitigate these risks. We will explore the multifaceted nature of GenAI risks and the essential infrastructure components to address them effectively. Key topics include implementing real-time monitoring systems to identify anomalies and biases, designing audit trails for enhanced transparency, and developing adaptive security measures to combat emerging threats.

The presentation will also cover governance strategies for GenAI, and the integration of ethical AI frameworks to support responsible development and deployment. This talk is tailored for CISOs, AI ethics officers, ML engineers, and IT architects aiming to build secure and responsible GenAI systems.

Talk: Supercharge ML Teams: ZenML's Real World Impact in the MLOps Jungle

Presenter:
Adam Probst, CEO & Co-Founder, ZenML

About the Speaker:
Adam Probst is the Co-founder and CEO of ZenML, an open-source MLOps framework simplifying machine learning pipelines. He holds a degree in Mechanical Engineering and studied at both Stanford University and the Technical University of Munich. Before co-founding ZenML, Adam gained valuable experience in the ML startup world within the commercial vehicle industry. Driven by a passion for customer-centric solutions, Adam is obsessed with unlocking the tangible benefits of MLOps for businesses.

Talk Abstract:
In the complex ecosystem of machine learning operations, teams often find themselves entangled in a dense jungle of tools, workflows, and infrastructure challenges. This talk explores how ZenML, an open-source MLOps framework, is cutting through the underbrush to create clear paths for ML teams to thrive.
We’ll dive into real-world case studies demonstrating how ZenML has empowered organizations to streamline their ML pipelines, from experimentation to production. Attendees will learn how ZenML addresses common pain points such as reproducibility, scalability, and collaboration, enabling teams to focus on innovation rather than operational overhead.
Key topics include:

Navigating the MLOps tooling landscape with ZenML as your compass
Achieving seamless transitions from laptop to cloud deployments
Enhancing team productivity through standardized, yet flexible, ML workflows
Lessons learned from implementing ZenML in diverse industry settings

Whether you’re a data scientist, ML engineer, or team lead, you’ll gain practical insights on how to leverage ZenML to supercharge your ML initiatives and conquer the MLOps jungle.

Talk: Scale Expert Review by 10x to Ship AI Apps at Lightning Speed

Presenter:
Niklas Nielsen, CTO, Co-Founder, Log10

About the Speaker:
Niklas Nielsen is CTO & Co-founder of Log10.io, a platform that rapidly measures and improves the accuracy of LLM applications by scaling your subject matter experts. Nik previously was Head of Product at MosaicML (acq. Databricks). Prior to that he worked at Intel and Mesosphere on building Distributed Systems deployed at large scale at companies such as Twitter and Apple, and at Adobe on the Virtual Machines and Compilers team. He co-founded CustomerDB, a startup applying AI to product management.

Talk Abstract:
The time and expense of subject matter expert (SME) review is a major barrier to developing generative AI applications, especially for high-risk use cases such as healthcare, finance, insurance, and more. Log10 scales SME review by 10x or more to accelerate deployment to production.

Our AutoFeedback system customizes domain-specific evaluation models that review LLM completions in real time with near-human accuracy, leveraging proprietary Latent Space Readout technology that needs 90% less data than fine-tuned evaluation model approaches. With as few as 20 SME-labeled examples, dev teams can rapidly assess and enhance the accuracy of their generative AI app.

We’ll demo AutoFeedback in a summarization use case, generating scores that assess the quality of CNN news summaries in real time. We’ll show that Latent Space Readout delivers superior accuracy to LLM-as-a-judge, and is cheaper and faster to use than fine tuning an evaluation model with comparable accuracy.

Talk: Finding the Hidden Drivers of AI Business Value

Presenter:
Jakob Frick, CEO & Co-Founder, Radiant

About the Speaker:
Jakob Frick is the CEO and Co-founder of Radiant AI. Before that he worked at Palantir Technologies across a range of areas, from Covid vaccine distribution work with the NHS, to national-scale cyber defense, to model integration across platforms. Before that he worked on open source software at JP Morgan Chase.

Talk Abstract:
How do you know how well your AI products are actually working? In this talk we will explore how companies are looking beyond evaluations to tie LLM activity to their business outcomes. We’ll look at case studies and examples from the field, as well as a framework for identifying the metrics that really move the needle in creating value with Generative AI.

Talk: Effective Workflows for Delivering Production-Ready LLM Apps

Presenter:
Ariel Kleiner, CEO & Co-Founder, Inductor

About the Speaker:
Ariel Kleiner is the CEO and founder of Inductor, which enables teams to deliver production-ready LLM applications significantly faster, more easily, and more reliably. Ariel was previously at Google AI, cofounded Idiomatic, and holds a PhD in computer science (specifically, in machine learning) from UC Berkeley.

Talk Abstract:
Going from an idea to an LLM application that is actually production-ready (i.e., high-quality, trustworthy, cost-effective) is difficult and time-consuming. In particular, LLM applications require iterative development driven by experimentation and evaluation, as well as navigating a large design space (with respect to model selection, prompting, retrieval augmentation, fine-tuning, and more). The only way to build a high-quality LLM application is to iterate and experiment your way to success, powered by data and rigorous evaluation; it is essential to then also observe and understand live usage to detect issues and fuel further improvement. In this talk, we cover the prototype-evaluate-improve-observe workflow that we’ve found to work well, and actionable insights as to how to apply this workflow in practice.

Talk: Optimizing LLM Apps Through Usage: Implicit Feedback, Given Explicitly

Presenter:
Chinar Movsisyan, CEO, Feedback Intelligence

About the Speaker:
Chinar Movsisyan is the founder and CEO of Feedback Intelligence, an MLOps company based in San Francisco that enables enterprises to make sure that LLM-based products are reliable and that the output is aligned with end-user expectations. With over eight years of experience in deep learning, spanning from research labs to venture-backed startups, Chinar has led AI projects in mission-critical applications such as healthcare, drones, and satellites.

Talk Abstract:
In the rapidly evolving landscape of LLM-powered applications, AI teams face a unique challenge: gathering actionable insights from real-world usage to continuously improve app performance. This presentation will explore the common obstacles teams encounter when attempting to optimize their applications based on user interactions. We will dive into how implicit feedback—often hidden within everyday user behavior—can be harnessed effectively to drive measurable improvements. By sharing our approach to extracting and leveraging this data, we’ll demonstrate how it accelerates the development of smarter, more responsive LLM applications.

Talk: Multimodal Agents You Can Deploy Anywhere

Presenter:
David Cheng, Engineering Lead, Reka AI

About the Speaker:
David Cheng is an Engineering Lead at Reka, an AI Research and Product company building multimodal artificial intelligence to empower organisations and businesses. He holds a degree in Computer Science from Caltech. Before Reka, David worked at Google Cloud, leading teams around distributed databases and applied ML.

Talk Abstract:
Reka develops multimodal AI that can be deployed in the cloud, on premises, or on devices. Our frontier models are trained from scratch in an end-to-end fashion to understand text, images, video, and audio. They address the needs of both enterprises and consumers for building powerful applications such as video analysis, speech-to-speech translation, and multimodal document understanding. Join us as we share about how you can use Reka models and our agentic framework to build agents that can see, hear, and speak.

Talk: How to Build Your Own LLM User Feedback Loop with Nebuly

Presenter:
Zunair Waseem, Founding GTM, Nebuly AI

About the Speaker:
Zunair Waseem is a graduate of the University of North Texas with a background in information systems. He originally began his career in the non-profit sector, doing humanitarian work overseas. After being in the space for nearly half a decade, Zunair became part of the founding team at Nebuly. He enjoys spending time outdoors, reading, being with family, and is committed to continuous learning. He also speaks three languages.

Talk Abstract:
User feedback is key to turning any good product into a great one, and LLM-powered products are no exception. However, less than 1% of LLM users provide explicit feedback (thumbs up/down), making it difficult to improve LLM responses and enhance the user experience. Learn how Nebuly helps companies with LLMs in production to build their own LLM User Feedback Loop.

Talk: Overcoming Challenges in Deploying Successful MLOps Solutions

Presenter:
Aaron Cheng, VP of Data Science, dotData

About the Speaker:
Aaron is currently the Vice President of Data Science and Solutions at dotData. As a data science practitioner with 14 years of research and industrial experience, he has held various leadership positions in spearheading new product development in the fields of data science and business intelligence. At dotData, Aaron leads the data science team in working directly with clients and solving their most challenging problems.

Prior to joining dotData, he was a Data Science Principal Manager with Accenture Digital, responsible for architecting data science solutions and delivering business value for the tech industry on the West Coast. He was instrumental in the strategic expansion of Accenture Digital’s footprint in the data science market in North America.

Aaron received his Ph.D. degree in Applied Physics from Northwestern University.

Talk Abstract:
Building effective MLOps systems requires overcoming challenges like complex data pipelines, scaling, and maintaining model accuracy. This presentation will highlight key strategies to ensure models continuously adapt and perform well in changing data environments.

Key Takeaways:

How to automate feature engineering to speed up model development.

Why it’s critical to implement feature lineage tracking for increased transparency.

Deploying drift detection systems to monitor and safeguard model performance.

Early feature drift detection, combined with auto-retraining, prevents model degradation.

Real-world success using automated monitoring and maintenance in MLOps.
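
As a rough illustration of the drift-detection idea in the takeaways above (not dotData’s implementation; the test choice and threshold are assumptions), a two-sample Kolmogorov–Smirnov check on a single feature can gate auto-retraining:

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(reference, live, alpha=0.05):
    """Flag drift when a two-sample KS test rejects 'same distribution'."""
    _stat, p_value = ks_2samp(reference, live)
    return bool(p_value < alpha)

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)  # feature values seen at training time
shifted = rng.normal(0.8, 1.0, 5000)    # live values after an upstream change

if feature_drifted(reference, shifted):
    print("drift detected: trigger retraining pipeline")
```

In practice a check like this would run per feature on a schedule, with the alert wired to the retraining job rather than a print statement.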

Talk: How to Avoid the Common Pitfalls when Scaling Your MLOps

Presenter:
Eero Laaksonen, CEO & Founder, Valohai

About the Speaker:
Serial entrepreneur hippie with a keen interest in making the world a better place. I believe people should work less and enjoy life more. Currently escalating the adoption of machine learning in enterprises around the world with Valohai.

Talk Abstract:
After speaking to thousands of companies over the years, we identified the exact reasons why many in-house ML teams struggle to scale efficiently.

Without spoiling the talk, this happens because of how most ML teams, their tech stacks, and processes tend to evolve over time.

The good news is that it’s never too late to change the way of thinking about MLOps and, as a result, unlock the path to scaling ML teams most efficiently.

What You’ll Learn:
– How ML teams and stacks evolve over time
– The common pitfalls when scaling MLOps
– The new way of thinking about MLOps
– How to scale your ML team efficiently

Talk: Building Hyper-Personalized LLM Applications with Rich Contextual Data

Presenter:
Sergio Ferragut, Principal Developer Advocate, Tecton

About the Speaker:
Sergio is the Principal Developer Advocate at Tecton where he partners closely with the Product and Engineering teams to ensure a seamless developer experience. He has extensive experience in Data Analytics, Data Engineering, Databases, Data Warehousing, Business Intelligence, Architecture, and Big Data.

Talk Abstract:
In the era of AI-driven applications, personalization is paramount. This talk explores the concept of Full RAG (Retrieval-Augmented Generation) and its potential to revolutionize user experiences across industries. We examine four levels of context personalization, from basic recommendations to highly tailored, real-time interactions.
The presentation demonstrates how increasing levels of context – from batch data to streaming and real-time inputs – can dramatically improve AI model outputs. We discuss the challenges of implementing sophisticated context personalization, including data engineering complexities and the need for efficient, scalable solutions.
Introducing the concept of a Context Platform, we showcase how tools like Tecton can simplify the process of building, deploying, and managing personalized context at scale. Through practical examples in travel recommendations, we illustrate how developers can easily create and integrate batch, streaming, and real-time context using simple Python code, enabling more engaging and valuable AI-powered experiences.
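
To make the levels of context concrete, here is a toy sketch in plain Python (the store names and fields are hypothetical, not Tecton’s API) of how batch, streaming, and real-time signals might be layered into a prompt:

```python
def build_context(user_id, batch_store, stream_store, request):
    """Assemble prompt context from increasingly fresh sources:
    batch features (nightly aggregates), streaming features
    (last-hour activity), and the real-time request itself."""
    parts = [
        f"preferences: {batch_store.get(user_id, 'unknown')}",
        f"recent activity: {stream_store.get(user_id, 'none')}",
        f"current request: {request}",
    ]
    return "; ".join(parts)

# Hypothetical travel-recommendation example
batch_store = {"u1": "beach destinations, budget travel"}
stream_store = {"u1": "viewed 3 Lisbon hotels in the last hour"}
print(build_context("u1", batch_store, stream_store, "flights to Lisbon in May"))
```

Each level adds fresher information, which is the point of the talk: the more current the context, the more tailored the model’s output can be.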

Talk: Enterprise AI Alignment: Engineering Trust into GenAI Systems

Presenter:
Ron Baker, Chief Technology Officer, Trustwise

About the Speaker:
Ron Baker is the Chief Technology Officer of Trustwise, focused on bringing his expertise in enterprise software scale, reliability, security and usability to the innovative features the team has delivered. He is a recently retired IBM Distinguished Engineer from IBM’s Sustainability Software organization, whose product suites manage Assets, Building Facilities, Supply Chains, Weather and Environmental Intelligence, and ESG Reporting, featuring climate risk analysis and financed emissions. There he led the Environmental Insights strategy, the SRE discipline, and operations technology for SaaS offerings.

Talk Abstract:
In the race to build powerful AI systems, alignment and trustworthiness have become central challenges for both researchers and enterprises. While organizations like OpenAI and Anthropic focus on general alignment for Artificial General Intelligence (AGI), ensuring AI models behave ethically and follow human values, businesses face an additional layer of complexity. For enterprises, especially in regulated industries like healthcare, finance, and insurance, AI systems must meet strict operational, ethical, and regulatory standards to ensure safety and compliance.

This session will dive into the practical aspects of AI alignment, focusing on how businesses can balance the competing demands of performance, safety, and compliance while ensuring their AI systems are trustworthy and aligned with real-world goals.

Talk: Your Entire AI/ML Lifecycle in A Single Platform

Presenter:
Hudson Buzby, Solution Architect, JFrog

About the Speaker:
Hudson Buzby is a Solutions Architect at JFrog, helping customers build and design machine learning systems. Prior to that, Hudson spent a number of years in the data engineering space, particularly in the world of Apache Spark, Kafka, and distributed systems.

Talk Abstract:
Dive into the world of deploying AI/ML applications to production with JFrog ML. This quick session will showcase our advanced MLOps, LLMOps, and Feature Store capabilities designed to streamline your AI development processes.

  • Watch how JFrog ML streamlines AI workflows from ideation to production, ensuring efficiency and quality.
  • Discover tools designed for managing and optimizing large language models (LLMs) effectively.
  • Learn how feature stores can significantly improve data management and model performance in your AI projects.

Talk: ML Feature Lakehouse: Empowering Data Scientists to Build Petabyte-Scale Pipelines with Iceberg

Presenter:
Simba Khadder, Founder & CEO, Featureform

About the Speaker:
Simba Khadder is the Founder & CEO of Featureform. After leaving Google, Simba founded his first company, TritonML. His startup grew quickly and Simba and his team built ML infrastructure that handled over 100M monthly active users. He instilled his learnings into Featureform’s virtual feature store. Featureform turns your existing infrastructure into a Feature Store. He’s also an avid surfer, a mixed martial artist, a published astrophysicist for his work on finding Planet 9, and he ran the SF marathon in basketball shoes.

Talk Abstract:
The term “Feature Store” might sound like just a place to store features, but in reality, it’s a powerful system for defining, managing, and deploying large-scale data pipelines. This session will simplify feature stores by breaking down the three main types and showing how they fit into an ML ecosystem. We’ll explore how feature stores enable data scientists to build, manage, and scale their own pipelines, even at petabyte levels, while handling streaming data and ensuring versioning and lineage.

Join Simba Khadder, founder and CEO of Featureform, as he cuts through the jargon and delivers practical, real-world examples. You’ll learn how feature stores can be used to build scalable data pipelines for AI/ML, and get a clear roadmap for integrating them into your ML workflows.

We’ll also take a look under the hood to see how Featureform achieves this scale using Apache Iceberg, so you leave with actionable insights to improve your ML platforms and projects.

Talk: Efficient AI Scaling: How VESSL AI Enables 100+ LLM Deployments for $10 and Saves $1M Annually

Presenter:
Jaeman An, Co-Founder & CEO, VESSL AI

About the Speaker:
Jaeman An is the CEO of VESSL AI and a graduate of KAIST with a background in Electrical and Electronic Engineering. He has extensive experience in machine learning and DevOps, having previously led the DevOps infrastructure for the mobile game ‘Cookie Run,’ which achieved 10 million daily active users. He also served as VP of Engineering at a medical AI startup, launching multiple successful AI services. His drive to solve inefficiencies in AI deployment and operations led him to found VESSL AI.

Talk Abstract:
This session will demonstrate how VESSL AI enables enterprises to efficiently scale and deploy over 100 Large Language Models (LLMs) starting at just $10, saving up to $1M annually in cloud costs. We will explore real-world case studies from industries like finance, healthcare, and e-commerce, showcasing practical solutions to optimize infrastructure and reduce operational costs.

Talk: Catching Bad Guys Using Open Data and Open Models for Graphs to Power AI Apps

Presenter:
Paco Nathan, Principal DevRel Engineer, Senzing

About the Speaker:
Paco Nathan leads DevRel for the Entity Resolved Knowledge Graph practice area at Senzing.com and is a computer scientist with 40+ years of tech industry experience and core expertise in data science, natural language, graph technologies, and cloud computing. He’s the author of numerous books, videos, and tutorials about these topics.

Paco advises Argilla.io (acq. Hugging Face), Kurve.ai, EmergentMethods.ai, KungFu.ai, and DataSpartan, and is lead committer for the `pytextrank` and `kglab` open source projects. Formerly: Director of Learning Group at O’Reilly Media; and Director of Community

Talk Abstract:
GraphRAG is a popular way to use knowledge graphs to ground AI apps in facts. Most GraphRAG tutorials use LLMs to build graphs automatically from unstructured data. However, what if you’re working on use cases such as investigations and sanctions compliance — “catching bad guys” — where transparency for decisions and evidence are required?

This talk introduces how investigative practices leverage open data for AI apps, using entity resolution to build graphs that are accountable. We’ll look at resources such as Open Sanctions and Open Ownership, plus data models used to explore less-than-legit behaviors at scale, such as money laundering through anonymized offshore corporations. We’ll show SOTA open models used for components of this work, such as named entity recognition, relation extraction, textgraphs, and entity linking, and link to extended tutorials based on open source.
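
As a toy illustration of the entity-resolution idea (the dataset names are real, but the record IDs and the crude name matcher are invented for this sketch), records from different open sources that refer to the same company can be merged into one entity while keeping their source IDs as auditable evidence:

```python
records = [
    {"id": "os:101", "name": "ACME Holdings Ltd", "source": "OpenSanctions"},
    {"id": "oo:7", "name": "Acme Holdings Limited", "source": "OpenOwnership"},
    {"id": "oo:9", "name": "Blue Harbor LLC", "source": "OpenOwnership"},
]

def normalize(name):
    # Crude matcher for illustration only: lowercase and expand one abbreviation.
    return name.lower().replace("ltd", "limited").strip()

def resolve(records):
    """Group records whose normalized names match, keeping source IDs as evidence."""
    entities = {}
    for rec in records:
        entities.setdefault(normalize(rec["name"]), []).append(rec["id"])
    return entities

entities = resolve(records)
print(entities)  # the two ACME records collapse into a single entity
```

Real entity resolution uses far more robust matching, but the key property is the same: every merged node retains the source records behind it, so decisions stay transparent.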

Talk: AI Tools Under Control: Keeping Your Agents Secure and Reliable

Presenter:
Bar Chen, Senior Software Engineer, Aporia

About the Speaker:
Bar Chen is a senior software engineer and product manager at Aporia, where he has worked for the past two years. He has spent nearly eight years in the tech industry, focusing on both cybersecurity and AI.

Talk Abstract:
This session focuses on AI tools and the importance of keeping them secure and reliable. We’ll discuss the main security challenges these tools face and share simple, practical solutions to address them. You’ll discover how applying best practices can help protect your AI systems, reduce risks, and maximize their effectiveness.

Talk: Era of Multimodal AI & Reasoning

Presenter:
Ivan Nardini, Developer Relation Engineer, AI/ML, Google Cloud

About the Speaker:
Ivan Nardini is a Developer Relations Engineer on Google’s Cloud team, focusing on Artificial Intelligence and Machine Learning. He enables developers to build innovative AI and ML applications using their preferred libraries, models, and tools on Vertex AI, through code samples, online content, and events. Ivan holds a master’s degree in Economics and Social Sciences from Università Bocconi and completed specialized training in Data Science at the Barcelona Graduate School of Economics.

Talk Abstract:
The future of AI is multimodal. In this session, you will explore the importance of large context windows for effective reasoning over multiple modalities, and how caching mechanisms, similar to human memory, can enhance performance. You will also learn how large context windows and context caching unlock exciting new use cases, and why multimodal AI is crucial for building better systems.

Talk: Mastering Enterprise-Grade LLM Deployment: Overcoming Production Challenges

Presenter:
Jaeman An, Co-Founder & CEO, VESSL AI

About the Speaker:
Jaeman An graduated from KAIST with a degree in Electrical and Electronic Engineering. He has extensive expertise in DevOps and machine learning, having played a pivotal role in managing the DevOps infrastructure for the mobile game ‘Cookie Run,’ which reached 10 million daily active users. He later joined a medical AI startup as VP of Engineering, where he successfully launched and operated various AI services. Through this experience, he identified inefficiencies in machine learning development and operations processes and founded VESSL AI to address these challenges.

Talk Abstract:
This session delves into the practical challenges of deploying Large Language Models (LLMs) in production, particularly for enterprise applications. We’ll cover topics such as managing computational resources, optimizing model performance, ensuring data security, and adhering to compliance standards. The talk will also showcase strategies to mitigate these challenges, focusing on infrastructure management, latency reduction, and model reliability. Case studies from industries such as healthcare, finance, and e-commerce will illustrate how enterprises can safely and efficiently integrate LLMs into their existing systems.