Event Speakers

Click on a speaker card to view the talk abstract.

Advanced Technical/Research

Business Strategy

Panel Discussion

Future of AI

Speakers Corner

Lightning Talks

(more coming soon)


Agenda

This agenda is still subject to change.
Talk: A Practical Guide to Efficient AI

Presenter:
Shelby Heinecke, Senior AI Research Manager, Salesforce

About the Speaker:
Dr. Shelby Heinecke leads an AI research team at Salesforce. Shelby’s team develops cutting-edge AI for product and research in emerging directions including autonomous agents, LLMs, and on-device AI. Prior to leading her team, Shelby was a Senior AI Research Scientist focusing on robust recommendation systems and productionizing AI models. Shelby earned her Ph.D. in Mathematics from the University of Illinois at Chicago, specializing in machine learning theory. She also holds an M.S. in Mathematics from Northwestern and a B.S. in Mathematics from MIT. Website: www.shelbyh.ai

Talk Track: Research or Advanced Technical

Talk Technical Level: 3/7

Talk Abstract:
In the past two years, we’ve witnessed a whirlwind of AI breakthroughs powered by extremely large and resource-demanding models. As engineers and practitioners, we are now faced with deploying these AI models at scale in resource-constrained environments, from cloud to on-device. In this talk, we will first identify key sources of inefficiency in AI models. Then, we will discuss techniques and practical tools to improve efficiency, from model architecture selection, to quantization, to prompt optimization.
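As a taste of the quantization technique the abstract mentions, here is a minimal, self-contained sketch (illustrative only, not from the talk): symmetric post-training int8 quantization of a single weight vector, with the reconstruction error it introduces.

```python
# Illustrative sketch: symmetric int8 post-training quantization of one
# weight vector. Real toolchains quantize per-tensor or per-channel with
# calibration data; this shows only the core scale-and-round idea.

def quantize_int8(weights):
    """Map floats onto the int8 range [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.02, -1.5, 0.7, 3.0, -0.004]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The per-weight error is bounded by half of one quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

The storage win is the point: each weight shrinks from 32 (or 16) bits to 8, at the cost of the bounded rounding error checked above.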

What You’ll Learn
TBA

Talk: AutoGen: Enabling Next-Gen AI Applications via Multi-Agent Conversation

Presenter:
Qingyun Wu, Assistant Professor at Penn State University; Creator of AutoGen, Agmax Inc

About the Speaker:
Dr. Qingyun Wu is an Assistant Professor at the College of Information Science and Technology at Penn State University. She was a postdoctoral researcher at Microsoft Research’s NYC lab from 2020 to 2021, and received her Ph.D. in Computer Science from the University of Virginia in 2020. Qingyun received the 2019 SIGIR Best Paper Award and the ICLR 2024 LLM Agents Workshop Best Paper Award. Qingyun is the creator and one of the core maintainers of AutoGen, a leading programming framework for agentic AI applications.

Talk Track: Advanced Technical/Research

Talk Technical Level: 1/7

Talk Abstract:
AutoGen is an open-source programming framework for agentic AI. It enables the development of AI agentic applications using multiple agents that can converse with each other to solve tasks. In this session, the speaker will provide a deep dive into the key concepts of AutoGen, demonstrate diverse applications enabled by AutoGen, and share the latest updates and ongoing efforts spanning across key directions such as evaluation, interfaces, learning/optimization/teaching, and seamless integration with existing AI technologies.

What You’ll Learn:

  • Agentic AI and the core concepts of AutoGen as an open-source programming framework for agentic AI
  • How AutoGen enables the development of AI applications using multiple conversing agents
  • The architecture and key components of the AutoGen framework
  • Various applications and use cases made possible by AutoGen
  • Recent updates and ongoing developments in the AutoGen project
  • Key areas of focus in AutoGen’s development.

 

You will gain insights into how AutoGen can be used to create advanced AI applications that leverage multi-agent conversations to solve complex tasks. You will also get a glimpse of the future directions and potential impact of this technology in the field of AI.
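For readers new to the pattern, the multi-agent conversation idea can be sketched in a few lines of plain Python. This is deliberately not AutoGen’s actual API: the two agents below are stub functions and the "TERMINATE" convention is only illustrative.

```python
# Toy sketch of a two-agent conversation loop, the pattern agentic
# frameworks like AutoGen build on. Agents here are stub functions that
# map an incoming message to a reply; a real system would call an LLM.

def solver(message):
    # Stub "assistant" agent: proposes an answer when given a task.
    if "task:" in message:
        return "proposed answer: 42"
    return "TERMINATE"

def critic(message):
    # Stub "user proxy" agent: accepts the first proposal it sees.
    if message.startswith("proposed answer"):
        return "looks good. TERMINATE"
    return "task: compute the answer"

def converse(agent_a, agent_b, opening, max_turns=6):
    """Alternate messages between two agents until one signals TERMINATE."""
    transcript = [opening]
    speakers = [agent_a, agent_b]
    msg = opening
    for turn in range(max_turns):
        msg = speakers[turn % 2](msg)
        transcript.append(msg)
        if "TERMINATE" in msg:
            break
    return transcript

log = converse(solver, critic, "task: compute the answer")
```

The value of a framework is everything this sketch omits: LLM-backed agents, tool execution, group-chat orchestration, and human-in-the-loop control.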

Talk: How to Run Your Own LLMs, From Silicon to Service

Presenter:
Charles Frye, AI Engineer, Modal Labs

About the Speaker:
Charles teaches people to build data, ML, and AI applications. He got his PhD from the University of California, Berkeley, in 2020 for work on the geometry of neural network optimization. He has since worked as an educator and evangelist for neural network applications at Weights & Biases, Full Stack Deep Learning, and now Modal Labs.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In this talk, AI Engineer Charles Frye will discuss the stack for running your own LLM inference service. We’ll cover: compute options like CPUs, GPUs, TPUs, & LPUs; model options like Qwen & LLaMA; inference server options like TensorRT-LLM, vLLM, & SGLang; and observability options like the OTel stack, LangSmith, W&B Weave, & Braintrust.
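One reason the hardware and server layers of this stack matter so much: single-stream autoregressive decoding is typically memory-bandwidth bound, so a back-of-envelope throughput ceiling follows from two numbers. The hardware figures below are assumptions for illustration, not from the talk.

```python
# Back-of-envelope decode throughput estimate: each generated token must
# read (roughly) all model weights from memory, so single-stream speed is
# bounded by memory bandwidth / model size in bytes. Batching and
# quantization change this picture, which is why serving stacks matter.

def decode_tokens_per_sec(params_billion, bytes_per_param, bandwidth_gb_s):
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Assumed example: a 7B-parameter model in fp16 (2 bytes/param) on a GPU
# with ~2 TB/s of memory bandwidth.
est = decode_tokens_per_sec(7, 2, 2000)   # ~143 tokens/sec ceiling
```

Halving bytes per parameter (e.g. int8 weights) roughly doubles this ceiling, which is one way quantization and inference-server choices interact.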

What You’ll Learn
Everything about serving LLMs, from the latest and greatest open source software tooling to the fundamental principles that drive engineering constraints across the stack.

Talk: Building State-of-the-Art Chatbot Using Open Source Models and Composite Systems

Presenter:
Urmish Thakker, Director of Machine Learning, SambaNova Systems

About the Speaker:
Urmish leads the LLM Team at SambaNova Systems. The LLM team at SambaNova focuses on understanding how to train and evaluate HHH-aligned large language models, adapting LLMs to enterprise use-cases, and HW-SW co-design of LLMs to enable efficient training and inference. Before SambaNova, he held various engineering and research roles at Arm, AMD, and Texas Instruments. He also helped drive the TinyML Performance Working Group in MLPerf, contributing to the development of key benchmarks for IoT ML. Urmish has 35+ publications and patents focusing on efficient deep learning and LLMs. His papers have been published at top ML and HW conferences like NeurIPS, ICLR, and ISCA, and he has been an invited speaker at various top universities and industry-academia summits. He completed his master’s at the University of Wisconsin-Madison and his bachelor’s at Birla Institute of Technology and Science.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
Open-source LLMs like LLAMA2 and BLOOM have enabled widespread development of enterprise LLM applications. As model adoption has matured over time, we have seen a rise in LLMs specialized for narrow domains, tasks, or modalities. Through such specialization, these models are able to outperform far larger proprietary or open models. For example, 7B-70B Llama experts like UniNER, TabLLAMA, NexusRaven, and SambaCoder-nsql-llama2 can outperform GPT-4 on NER, function calling, tabular data, and Text2SQL tasks. Many more such examples exist in open source. However, one unique feature that larger proprietary models offer is a single endpoint that takes a user query and provides a response. These responses can sometimes also include a chain of tasks that was solved to arrive at the response.

The question we try to answer in this research is whether we can build a composite LLM system from open-source checkpoints that provides the same usability as a larger proprietary model. This includes taking a user request and mapping it to a single checkpoint, or a group of checkpoints, that can solve the request and serve the user. Our work indicates that such a composite system is indeed possible. We show this by building a new state-of-the-art model based on various adaptations of the Mistral 7B model. Using unique ensembling methods, this composite model outperforms Gemma-7B, Mixtral-8x7B, Llama2-70B, Qwen-72B, Falcon-180B, and BLOOM-176B at the effective inference cost of a <10B-parameter model.
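The routing step at the heart of such a composite system can be caricatured in a few lines. The sketch below is purely illustrative: real systems use a learned router rather than keyword matching, and the expert checkpoint names here are hypothetical.

```python
# Toy sketch of the "composite system" idea from the abstract: map each
# incoming query to a specialized expert checkpoint. Keyword routing is a
# stand-in for the learned routers real systems use; expert names are
# hypothetical.

EXPERTS = {
    "sql": "text2sql-expert",
    "function": "function-calling-expert",
    "entity": "ner-expert",
}

def route(query, default="general-chat-model"):
    """Return the name of the checkpoint that should serve this query."""
    q = query.lower()
    for keyword, expert in EXPERTS.items():
        if keyword in q:
            return expert
    return default

model = route("Translate this question into SQL for our sales table")
```

The economics follow from the routing: every query pays the inference cost of one small expert, not of a monolithic 70B+ model.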

What You’ll Learn
TBA

Talk: DL-Backtrace by AryaXAI: A Model Agnostic Explainability for Any Deep Learning Models (LLMs to CV)

Presenter:
Vinay Kumar Sankarapu, Founder & CEO, Arya.ai

About the Speaker:
Vinay Kumar Sankarapu is the Founder and CEO of Arya.ai. He did his Bachelor’s and Master’s in Mechanical Engineering at IIT Bombay with research in Deep Learning and published his thesis on CNNs in manufacturing. He started Arya.ai in 2013, one of the first deep learning startups, along with Deekshith, while finishing his Master’s at IIT Bombay.

He co-authored a patent for designing a new explainability technique for deep learning and implementing it in underwriting in FSIs. He also authored a paper on AI technical debt in FSIs. He wrote multiple guest articles on ‘Responsible AI’, ‘AI usage risks in FSIs’. He presented multiple technical and industry presentations globally – Nvidia GTC (SF & Mumbai), ReWork (SF & London), Cypher (Bangalore), Nasscom(Bangalore), TEDx (Mumbai) etc. He was the youngest member of ‘AI task force’ set up by the Indian Commerce and Ministry in 2017 to provide inputs on policy and to support AI adoption as part of Industry 4.0. He was listed in Forbes Asia 30-Under-30 under the technology section.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
In today’s rapidly evolving AI landscape, deep learning models have become increasingly complex and opaque. They often function as “black boxes” that make decisions without transparent reasoning. This lack of explainability raises concerns in mission-critical applications where understanding the “why” behind a model’s decision is as important as the decision itself.

In this talk, we will introduce DL-Backtrace, a new technique designed at AryaXAI to explain any deep learning model—an LLM, a traditional computer vision model, or beyond. We will discuss the algorithm and benchmarking results of DL-Backtrace against techniques like SHAP, LIME, and GradCAM for various DL models: LLMs (Llama 3.2), NLP (BERT), CV (ResNet), and tabular data.

What You’ll Learn:
The importance of explainability in mission-critical functions and model pruning; the current scope of explainability for deep learning models; complexities in scaling explainability to large models like LLMs; drawbacks of various current techniques; background on DL-Backtrace; and results of the method for various DL models and subsequent work.

Talk: From Paper to Production in 30 Minutes: Implementing code-less Gen AI Research

Presenter:
Aarushi Kansal, AI Engineer, AutoGPT

About the Speaker:
Aarushi is a passionate and seasoned AI engineer, currently working at AutoGPT – one of the most popular projects on GitHub, aiming to democratize AI. Previously she initiated and led Generative AI at Bumble as a principal engineer. She has also been a software engineer at iconic companies such as ThoughtWorks, Deliveroo, and Tier Mobility.

Talk Track: Advanced Technical/Research

Talk Technical Level: 5/7

Talk Abstract:
New research papers appear in the Gen AI space almost every other day: on prompting, RAG, different models, and different ways to fine-tune. They often come with no code, and in this talk we're going to implement research papers in 30 minutes.

What You’ll Learn
In this talk the audience will learn how to take a research paper, implement it quickly (within 30 minutes), and then evaluate whether it is actually useful for their work.

Talk: Investigating the Evolution of Evaluation from Model Training to GenAI inference

Presenter:
Anish Shah, ML Engineer, Weights & Biases

About the Speaker:
Join Anish Shah for an in-depth session on fine-tuning and evaluating multimodal generative models. This talk will delve into advanced methodologies for optimizing text-to-image diffusion models, with a focus on enhancing image quality and improving prompt comprehension.
Learn how to leverage Weights & Biases for efficient experiment tracking, enabling seamless monitoring and analysis of your model’s performance.

Additionally, discover how to utilize Weave, a lightweight toolkit for tracking and evaluating LLM applications, to conduct practical and holistic evaluations of multimodal models.

The session will also introduce Hemm, a comprehensive library for benchmarking text-to-image diffusion models on image quality and prompt comprehension, integrated with Weights & Biases and Weave. By the end of this talk, you’ll be equipped with cutting-edge tools and techniques to elevate your multimodal generative models to the next level.

Talk Track: Advanced Technical/Research

Talk Technical Level: 2/7

Talk Abstract:
This session explores the evolution of evaluation techniques in machine learning, from traditional model training through fine-tuning to the current challenges of assessing large language models (LLMs) and generative AI systems. We’ll trace the progression from simple metrics like accuracy and F1 score to sophisticated automated evaluation systems that can generate criteria and assertions. The session will culminate in an in-depth look at cutting-edge approaches like EvalGen, which use LLMs to assist in creating aligned evaluation criteria while addressing phenomena like criteria drift.

What You’ll Learn:
Attendees will gain a comprehensive understanding of evaluation techniques across different ML paradigms, from cross-validation in traditional training to the nuances of evaluating fine-tuned models and LLMs. You’ll learn practical approaches for automating evaluation criteria and assertions, strategies for aligning these automated evaluations with human judgments, and techniques for handling the unique challenges posed by generative AI, such as criteria drift and the balance between human oversight and AI-assisted evaluation.
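The "automated evaluation criteria and assertions" idea can be made concrete with a small sketch. Systems like EvalGen use an LLM to propose criteria; the hand-written checks below are only an illustration of what such criteria look like once expressed as code.

```python
# Minimal sketch of criteria-based evaluation for LLM outputs: each
# criterion is a cheap programmatic assertion. EvalGen-style systems
# generate criteria like these automatically and align them with human
# judgments; these hand-written ones are purely illustrative.

criteria = {
    "mentions_refund": lambda out: "refund" in out.lower(),
    "under_50_words": lambda out: len(out.split()) < 50,
    "no_apology_spam": lambda out: out.lower().count("sorry") <= 1,
}

def evaluate(output):
    """Run every criterion and return per-criterion results plus pass rate."""
    results = {name: check(output) for name, check in criteria.items()}
    pass_rate = sum(results.values()) / len(criteria)
    return results, pass_rate

res, rate = evaluate("You will receive a refund within 5 days.")
```

"Criteria drift", as discussed in the session, is what happens when the set of checks that felt right at the start stops matching human judgments as outputs evolve.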

Talk: Can Long-Context LLMs Truly Use Their Full Context Window Effectively?

Presenter:
Lavanya Gupta, Senior Applied AI/ML Associate | CMU Grad | Gold Medalist | Tech Speaker, JPMorgan Chase & Co.

About the Speaker:
I am Lavanya, a graduate student from Carnegie Mellon University (CMU), Language Technologies Institute (LTI); and a passionate AI/ML industrial researcher with 5+ years of experience. I am also an avid tech speaker and have delivered several talks and participated in panel discussions at conferences like Women in Data Science (WiDS), Women in Machine Learning (WiML), PyData, TensorFlow User Group (TFUG). In addition, I am dedicated to providing mentorship via collaborations with multiple organizations like Anita Borg.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
Recently there has been growing interest in extending the context length (input window size) of large language models (LLMs), aiming to effectively process and reason over long input documents as large as 128K tokens (i.e., roughly 200 pages of a book). Long-context large language models (LC LLMs) promise to increase the reliability of LLMs in real-world tasks. Most model-provider benchmarks champion the idea that LC LLMs are getting better and smarter over time. However, these claims often fail to hold up in real-world applications.

In this session, we evaluate the performance of the state-of-the-art GPT-4 suite of LC LLMs in solving a series of progressively challenging tasks on a real-world financial news dataset, using an adapted version of the popular “needle-in-a-haystack” paradigm. We see that leading LC LLMs exhibit brittleness at longer context lengths even for simple tasks, with performance deteriorating sharply as task complexity increases. At longer context lengths, these state-of-the-art models experience catastrophic failures in instruction following, resulting in degenerate outputs. Prompt ablations also expose continued sensitivity both to the placement of the task instruction in the context window and to minor markdown formatting.

Overall, we will address the following questions in our session:
1. Does performance depend on the choice of prompting?
2. Can models reliably use their full context?
3. Does performance depend on the complexity of the underlying task?
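The core mechanics of a needle-in-a-haystack probe are easy to sketch: plant a known fact at a controlled depth inside filler context, then score whether the model recovers it. The filler text and recall metric below are illustrative; the model call itself is left out.

```python
# Sketch of a needle-in-a-haystack probe. A known "needle" fact is placed
# at a chosen depth in filler context; sweeping depth_fraction and context
# size maps out where a long-context model stops recovering the fact.
# The LLM call is omitted; scoring here is simple exact recall.

def build_haystack(needle, filler_sentence, total_sentences, depth_fraction):
    """Place the needle at depth_fraction (0.0 = start, 1.0 = end)."""
    position = int(depth_fraction * total_sentences)
    sentences = [filler_sentence] * total_sentences
    sentences.insert(position, needle)
    return " ".join(sentences)

needle = "The special magic number is 7481."
context = build_haystack(needle, "The market was quiet today.", 200, 0.5)

def score(model_answer):
    # 1.0 if the embedded fact was recovered, else 0.0.
    return 1.0 if "7481" in model_answer else 0.0
```

The adaptations the session describes replace the synthetic needle and filler with facts and documents from a real-world corpus, which is where the brittleness shows up.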

What You’ll Learn
Key learnings:
1. Complete understanding of the popular “Needle-in-a-Haystack” paradigm
2. Learning the shortcomings of the traditional “Needle-in-a-Haystack” setup and how to improve it for real-world applications
3. Can state-of-the-art long-context LLMs truly use their full context window effectively?
4. Are models able to perform equally well at both short-context and long-context tasks?

Talk: Unleashing the Algorithm Genie: AI as the Ultimate Inventor

Presenter:
Jepson Taylor, Former Chief AI Strategist at DataRobot & Dataiku; VEOX Inc

About the Speaker:
Jepson is a popular speaker in the AI space, having been invited to give AI talks to companies like SpaceX, Red Bull, Goldman Sachs, Amazon, and various branches of the US government. Jepson’s applied career has covered semiconductors, quant finance, HR analytics, a deep-learning startup, and AI platform companies. Jepson co-founded and sold his deep-learning company Zeff.ai to DataRobot in 2020 and later joined Dataiku as their Chief AI Strategist. Jepson is currently launching a new AI company focused on the next generation of AI called VEOX Inc.

Talk Track: Research or Advanced Technical

Talk Technical Level: 3/7

Talk Abstract:
Prepare to have your understanding of AI capabilities turned upside down. Jepson Taylor presents groundbreaking advancements in the field of generative algorithms, where AI systems now possess the ability to invent and optimize their own algorithms. This talk explores how adaptive workflows can produce thousands of novel solutions daily, effectively automating the role of the AI researcher. Through engaging demonstrations, attendees will explore the vast potential of this technology to accelerate innovation across all sectors. Discover how these self-evolving systems are set to redefine the boundaries of what’s possible in technology and learn how you can start incorporating these concepts into your own work.

What You’ll Learn
Cutting-edge advancements in multi-agent systems and their role in driving AI innovation. The paradigm shift from prompt engineering to goal engineering in AI development. The power and potential of bespoke algorithms versus general-purpose solutions. How generative algorithms are revolutionizing the field of AI research and development. Practical insights into implementing automated innovation systems for rapid solution generation. Strategies for integrating self-evolving AI systems into various industry applications. Real-world examples and case studies of generative algorithms in action.

Talk: Build with Mistral

Presenter:
Sophia Yang, Head of Developer Relations, Mistral AI

About the Speaker:
Sophia Yang is the Head of Developer Relations at Mistral AI, where she leads developer education, developer ecosystem partnerships, and community engagement. She is passionate about the AI community and the open-source community, and she is committed to empowering their growth and learning. She holds an M.S. in Computer Science, an M.S. in Statistics, and a Ph.D. in Educational Psychology from The University of Texas at Austin.

Talk Track: Advanced Technical/Research

Talk Technical Level: 1/7

Talk Abstract:
In the rapidly evolving landscape of Artificial Intelligence (AI), open source and openness in AI have emerged as crucial factors in fostering innovation, transparency, and accountability. Mistral AI’s release of open-weight models has sparked significant adoption and demand, highlighting the importance of open source and customization in building AI applications. This talk focuses on the Mistral AI model landscape, the benefits of open source and customization, and the opportunities for building AI applications using Mistral models.

What You’ll Learn
TBA

Talk: Agentic AI: Learning Iteratively, Acting Autonomously

Presenter:
Fatma Tarlaci, CTO, Rastegar Capital

About the Speaker:
Dr. Fatma Tarlaci is a distinguished engineering leader with a wealth of experience in artificial intelligence, specializing in natural language processing. As the Chief Technology Officer at Rastegar Capital, she possesses a unique combination of technical expertise and leadership skills that distinguish her in the industry. She applies advanced AI solutions in her work that enhance business strategies and operational efficiencies through the integration of cutting-edge technologies. Her approach combines her deep expertise in AI with practical applications to solve complex challenges in the industry. Before entering the industry, she conducted research at OpenAI and taught in academia. Demonstrating her commitment to education, she currently also teaches as an Adjunct Assistant Professor of Computer Science at UT Austin. Her interdisciplinary background provides her with a refined ability to navigate diverse professional environments effectively. She mentors at several organizations and serves as an advisor to startups.

Talk Track: Advanced Technical or Research

Talk Technical Level: 3/7

Talk Abstract:
AI agents are advanced software systems capable of autonomous actions and decision-making to achieve specific goals. Built on sophisticated machine learning models, they can process and respond to dynamic data inputs in real-time. Unlike traditional large language models (LLMs) that generate outputs in a single attempt (zero-shot mode), agentic AI introduces an iterative, dynamic workflow. These workflows often involve multiple stages—planning, data gathering, drafting, assessment, and revision—significantly improving the quality of outcomes. This process mirrors human learning, where continual refinement leads to better results. Notably, iterative agentic workflows have recently shown impressive performance in tasks like coding, outperforming standard models on benchmarks such as HumanEval.

This presentation will provide an in-depth analysis of the architectures that underpin agentic AI, explore the cutting-edge technologies enabling their capabilities, and delve into their practical applications. It will also address key challenges in the field, such as scalability and ethical considerations, while exploring future directions. Attendees will gain a thorough understanding of AI agents, their current uses, and their potential to transform industries.
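The iterative workflow the abstract contrasts with zero-shot generation (plan, draft, assess, revise) can be sketched as a simple loop. The drafting and assessment steps below are stubs standing in for LLM calls; the structure, not the content, is the point.

```python
# Sketch of the iterative agentic workflow described in the abstract:
# draft, assess, and revise until the assessment passes or a budget runs
# out. The draft/assess functions are stubs in place of real LLM calls.

def draft(task, feedback=None):
    text = f"answer to {task}"
    if feedback:
        text += " (revised)"
    return text

def assess(text):
    # Stub critic: only revised drafts pass.
    return "(revised)" in text

def agentic_loop(task, max_iters=5):
    """Iterate draft -> assess -> revise, returning the result and iteration count."""
    feedback = None
    candidate = None
    for i in range(max_iters):
        candidate = draft(task, feedback)
        if assess(candidate):
            return candidate, i + 1
        feedback = "needs revision"
    return candidate, max_iters

result, iterations = agentic_loop("summarize the report")
```

A zero-shot model is this loop with max_iters=1; the measured gains on benchmarks like HumanEval come from the extra assess-and-revise passes.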

What You’ll Learn:
Attendees will learn about the fundamental architectures and technologies underpinning AI agents, their real-world applications, and the challenges they face, including ethical issues and scalability. The session will also explore future trends in AI development and its potential impact across various sectors.

Talk: Fast and Reproducible: Taming AI/ML Dependencies

Presenter:
Savin Goyal, Co-founder & CTO, Outerbounds

About the Speaker:
Savin is the co-founder and CTO of Outerbounds – where his team is building the modern ML stack to accelerate the impact of data science. Previously, he was at Netflix, where he built and open-sourced Metaflow, a full stack framework for data science.

Talk Track: Advanced Technical/Research

Talk Technical Level: 3/7

Talk Abstract:
Careful management of software dependencies is one of the most underrated parts of ML and AI systems, despite being critically important for the stability of production deployments and the speed of development. For many years, we have worked with the wider Python package management community (pip, conda, rattler, uv, and many more) and multiple organizations (Netflix, Amazon, Goldman Sachs, and many more) to advance the state of the art in dependency management for ML/AI platforms, including our open-source framework Metaflow.

In this talk, we’ll explore common pitfalls in dependency management and their impact on ML projects, from unexpected results due to package changes to the challenges of reproducing environments across different machines. We’ll cover issues ranging from the complexities of scaling dependencies in distributed cloud environments to performance regressions from seemingly innocuous updates, highlighting why robust dependency management is crucial for production ML systems.

We’ll share our learnings and demonstrate how we address key challenges in building robust and maintainable ML systems, such as:
  • Creating fast, stable, and reproducible environments for quick experimentation
  • Scaling cross-platform execution to the cloud with automatic dependency handling
  • Auditing the full software supply chain for security and compliance

We’ll also demo some of our recent work which enables baking very large container images in just a few seconds, significantly accelerating the prototyping and experimentation cycle for ML practitioners.

What You’ll Learn
This talk explores the critical yet often overlooked role of software dependency management in ML and AI systems. Drawing from years of collaboration with the Python package management community and major organizations, the speaker will share insights on common pitfalls in dependency management and their impact on ML projects, as well as recent innovations in rapid container image creation for accelerated ML experimentation.

Talk: Revolutionizing Cloud Storage: From Petabytes to Intelligence

Presenter:
Vinit Dhatrak, Lead Software Engineer, DocuSign

About the Speaker:
Vinit is a seasoned software engineer with a demonstrated history of building on-premise and cloud-native distributed systems at scale. Currently, Vinit serves as a Lead Software Engineer at DocuSign, contributing to DocuSign’s Storage team. With expertise encompassing cloud storage, distributed systems, and virtualization technologies such as Kubernetes, Docker, and the Linux Kernel, Vinit stands out as a thought leader in the tech industry. Throughout his career, Vinit has held pivotal roles at notable companies like Google, Box, Commvault, and Marvell, where he played an instrumental role in developing highly scalable and distributed cloud storage solutions. His proficiency in object-oriented design and systems programming, coupled with his capability to scale infrastructures to handle concurrent requests and planet-scale storage, positions him as a true expert in his field. Vinit is an alumnus of the Georgia Institute of Technology, where he earned a Master’s degree in Computer Science. His technical acumen and leadership capabilities are evident in his ability to mentor peers and collaborate effectively with industry leaders. Recognized for his impactful contributions, Vinit frequently engages with the tech community through conference participation. His dedication to advancing technological solutions is evident not only in his professional experience but also in his commitment to ongoing learning and development. Stay connected with Vinit through his LinkedIn profile to gain insights from his extensive knowledge of scalable design and distributed systems, as he continues to innovate and lead in the ever-evolving landscape of technology.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In an era driven by exponential data growth and the need for intelligent insights, cloud storage solutions must evolve to meet the dynamic demands of modern enterprises. This talk delves into the intricate process of migrating traditional on-premises blob storage systems to cutting-edge cloud platforms like Azure while integrating AI-powered insights to enhance cloud software offerings.

We will explore the architecture and implementation challenges faced while leading the Blob Storage team at DocuSign, focusing on how AI technologies were harnessed to transform data management within the Intelligent Agreement Management (IAM) platform. This initiative did not just facilitate a seamless transition but also pioneered a new category in cloud software, significantly bolstering market leadership.

Attendees will gain insights into optimizing resource utilization, achieving cost efficiency, and ensuring scalability in cloud migrations, drawing from a successful case implementing an intelligence-enabled cloud ecosystem. Furthermore, the talk will illuminate how AI and machine learning models were leveraged to provide actionable insights, assisting in strategic decision-making and enhancing user engagement.

The session will cover critical lessons learned, including identity and data security in cloud transformations, effective use of REST APIs for integration, and the deployment of microservices for agile and scalable services. Participants will leave equipped with advanced strategies to align their cloud migration efforts with organizational goals, optimize resources, and drive innovation through AI. Join us as we explore the convergence of cloud and artificial intelligence, unlocking new potentials in data storage solutions.

What You’ll Learn:
You’ll learn how to migrate on-premises blob storage to cloud platforms like Azure, focusing on DocuSign’s experience. We’ll explore the architectural and implementation challenges, highlighting how AI-powered insights were integrated into the Intelligent Agreement Management (IAM) platform. The talk will cover optimizing resource utilization and achieving cost efficiency during cloud migrations, using successful case studies. you’ll gain practical strategies for aligning cloud migrations with organizational goals, fostering innovation through AI-driven insights, and maximizing user engagement.

Talk: Optimizing AI/ML Workflows on Kubernetes: Advanced Techniques and Integration

Presenter:
Anu Reddy, Senior Software Engineer, Google

About the Speaker:
Anu is a senior software engineer working on optimizing Google Kubernetes Engine for techniques like RAG and supporting popular AI/ML frameworks and tools such as Ray.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
Explore advanced technical strategies for optimizing AI/ML workflows on Kubernetes, the world’s leading open-source container orchestration platform. This session will cover techniques for integrating open-source AI tools across a wide range of workflows, including training, inference, and prompt engineering (RAG, agents); managing multi-cluster environments; and ensuring cost-effective resource utilization. Participants will gain deep insights into how Kubernetes supports flexible and scalable AI/ML infrastructure, with specific examples of using Kubernetes-native tools like Kueue for job queuing and Ray for distributed computing. The session will also highlight the use of NVIDIA GPUs, TPUs, and advanced workload management strategies, with Google Kubernetes Engine (GKE) as an illustrative example.


What You’ll Learn
– Advanced techniques for optimizing AI/ML workflows on Kubernetes
– Integration of open-source AI tools within Kubernetes environments
– Strategies for managing multi-cluster AI/ML deployments and optimizing resource utilization

Talk: Driving GenAI Success in Production: Proven Approaches for Data Quality, Context, and Logging

Presenter:
Alison Cossette, Developer Advocate, Neo4j

About the Speaker:
Alison Cossette is a dynamic Data Science Strategist, Educator, and Podcast Host. As a Developer Advocate at Neo4j specializing in Graph Data Science, she brings a wealth of expertise to the field. With her strong technical background and exceptional communication skills, Alison bridges the gap between complex data science concepts and practical applications.

Alison’s passion for responsible AI shines through in her work. She actively promotes ethical and transparent AI practices and believes in the transformative potential of responsible AI for industries and society. Through her engagements with industry professionals, policymakers, and the public, she advocates for the responsible development and deployment of AI technologies. She is currently a volunteer member of the US Department of Commerce – National Institute of Standards and Technology’s Generative AI Public Working Group.

Alison’s academic journey includes a Master of Science in Data Science, specializing in Artificial Intelligence, at Northwestern University and research with the Stanford University Human-Computer Interaction Crowd Research Collective. Alison combines academic knowledge with real-world experience and leverages this expertise to educate and empower individuals and organizations in the field of data science. Overall, her multifaceted background, commitment to responsible AI, and expertise in data science make her a respected figure in the field. Through her role as a Developer Advocate at Neo4j and her podcast, she continues to drive innovation, education, and responsible practices in the exciting realm of data science and AI.

Talk Track: Advanced Technical/Research

Talk Technical Level: 2/7

Talk Abstract:
Generative AI is a part of our everyday work now, but folks are still struggling to realize business value from it in production.

Key Themes:

Methodical Precision in Data Quality and Dataset Construction for RAG Excellence: Uncover an integrated methodology for refining, curating, and constructing datasets that form the bedrock of transformative GenAI applications. Specifically, focus on the six key aspects crucial for Retrieval-Augmented Generation (RAG) excellence.

Navigating Non-Semantic Context with Awareness: Explore the infusion of non-semantic context through graph databases while understanding the nuanced limitations of the Cosine Similarity distance metric. Recognize its constraints in certain contexts and the importance of informed selection in the quest for enhanced data richness.

The Logging Imperative: Recognize the strategic significance of logging in the GenAI landscape. From application health to profound business insights, discover how meticulous logging practices unlock valuable information and contribute to strategic decision-making.

Key Takeaways:

6 Requirements for GenAI Data Quality

Adding non-semantic context, including an awareness of limitations in distance metrics like Cosine Similarity.

The strategic significance of logging for application health and insightful business analytics.

Join us on this methodologically rich exploration, “Beyond Vectors,” engineered to take your GenAI practices beyond the current Vector Database norms, unlocking a new frontier in GenAI evolution with transformative tools and methods!
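The cosine-similarity limitation flagged above is easy to demonstrate. Here is a minimal sketch (an illustrative example, not material from the talk itself): cosine similarity compares direction only, so any signal encoded in vector magnitude is invisible to it.

```python
import math

def cosine_similarity(a, b):
    # Compares direction only: dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Same direction, wildly different magnitudes: cosine scores them as identical,
# so any magnitude-encoded information (e.g. popularity, confidence) is lost.
print(cosine_similarity([1.0, 2.0, 3.0], [100.0, 200.0, 300.0]))  # 1.0
```

This is one reason a graph database can help: relationships supply non-semantic context that no direction-only distance metric can capture.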

What You’ll Learn
TBA

Talk: The State-of-the-art in Software Development Agents

Presenter:
Graham Neubig, Associate Professor / Chief Scientist, Carnegie Mellon University / All Hands AI

About the Speaker:
Graham Neubig is an associate professor at Carnegie Mellon University, focusing on machine learning methods for natural language processing, code generation, and AI agents. He is also co-founder and chief scientist at All Hands AI, a company building open-source software development agents.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
One of the most exciting application areas of AI is software development, and gradually we are moving towards more autonomous development where AI agents can perform software development tasks end-to-end. In this talk I will describe the latest work on AI software developers, including research developments, evaluation methods, and challenges that agents currently face. I will also discuss some of the MLOps challenges involved in deploying these agents, including safety and efficiency concerns.

What You’ll Learn:

  • Some of the latest developments in software-related AI
  • Representative software products at different levels of autonomy
  • AI modeling challenges involved in making highly accurate AI software engineers
  • MLOps challenges involved in deploying these software AI agents

Talk: LLMs Alone Do Not Solve Business Problems

Presenter:
Marinela Profi, Global AI & GenAI Lead, SAS

About the Speaker:
Marinela Profi is the Global AI & GenAI Lead at SAS. Leveraging her extensive background in data science, Marinela brings a unique perspective that bridges the realms of technology and marketing. She drives AI implementation within the Banking, Manufacturing, Insurance, Government, and Energy sectors. Marinela has a Bachelor’s in Econometrics, a Master of Science in Statistics and Machine Learning, and a Master’s in Business Administration (MBA). She enjoys sharing her journey on LinkedIn, and on the main stage, to help those interested in a career in data and tech.

Talk Track: Business Strategy

Talk Technical Level: 4/7

Talk Abstract:
The rapid rise of generative AI in 2023 sparked widespread experimentation, with companies across industries eager to leverage its potential for content generation, task automation, and customer experience transformation. However, not all organizations have found success, and by 2025, a clear divide will emerge between those excelling with AI-driven innovation and those struggling to keep up. In this session we will explore the key principles that are differentiating organizations who are winning the generative AI wave and successful use cases.

What You’ll Learn
Join me to learn the reasons behind this divide, mistakes to avoid and key factors and use cases driving success for companies leveling up with AI, so you can win too.

Talk: AI Features Demand Evidence-Based Decisions

Presenter:
Connor Joyce, Senior User Researcher, Microsoft and Author of “Bridging Intentions to Impact”

About the Speaker:
Connor Joyce is the author of “Bridging Intentions to Impact” and a Senior User Researcher on the Microsoft Copilot Team, where he is advancing the design of AI-enhanced features. Passionate about driving meaningful change, Connor advocates that companies adopt an Impact Mindset, ensuring that products not only change behavior to satisfy user needs but also drive positive business outcomes. He is a contributor to numerous publications, advises emerging startups, and lectures at the University of Pennsylvania. Based in Seattle, Connor enjoys exploring the outdoors with his dog, Chai, and is a local event organizer.

Talk Track: Business Strategy

Talk Technical Level: 5/7

Talk Abstract:
We are in the midst of a technology paradigm shift, and there is significant pressure on product teams to build Generative AI (GenAI) into their products. Navigating these uncharted waters requires decisions based on a deep understanding of user needs to ensure that this new technology is leveraged in the most beneficial way for both users and the business. This presentation emphasizes the necessity of creating a demand for insights by product teams and the democratization of evidence creation. Doing both can be achieved by defining features in a way that highlights the evidence supporting why they should work. By using the novel User Outcome Connection, teams can naturally identify what data is known and unknown about a feature. This framework makes the pursuit of new research to fill the gaps more straightforward, ensuring a solid foundation for decision-making.

By developing User Outcome Connection frameworks for key features, teams can design solutions that appropriately and effectively incorporate GenAI. This will be showcased through B2B and B2C examples illustrating the practical application and transformative potential of this approach.

What You’ll Learn
Attendees will learn how using the User Outcome Connection framework for key features enables the strategic use of GenAI where it truly adds value. By the end of this session, participants will be equipped with actionable steps to adopt evidence-based frameworks, ensuring their products meet the evolving demands of technology and user expectations. Join this session to learn how to navigate the AI paradigm shift with evidence-based decisions and design truly impactful AI-enhanced features.

Talk: Building Trust in AI Systems

Presenter:
Joseph Tenini, Principal Data Scientist, Universal Music Group

About the Speaker:
Joseph Tenini has worked in data science for over a decade in a variety of industries including healthcare, publishing, digital marketing, and entertainment. He has developed, deployed, and managed the lifecycle of a variety of ML-enabled products in many different settings. His specific expertise lies in recommender systems, reinforcement learning, and process improvement. He holds a PhD in Mathematics from the University of Georgia.

Talk Track: Business Strategy

Talk Technical Level: 3/7

Talk Abstract:
As builders of AI and ML systems, we spend much time and effort building our own trust in the technology we are developing. This can take the form of model accuracy metrics, compute efficiency, and core functionality achieved. There is another, often more daunting, step to be considered: building trust in the technology with non-technical users and other stakeholders who will be impacted by its adoption.

In this talk, we explore four pillars of building trust in AI systems with non-technical stakeholders:
1. Describing performance relative to an interpretable and intuitive baseline.
2. Quantifying uncertainty as part of the delivery process.
3. Sharing “the why” in non-binary decision processes.
4. Designing for second-order process effects.

After this talk, machine learning practitioners and managers will be equipped to build trust in the products they develop – enabling maximum value and impact from their work.

What You’ll Learn
TBA

Talk: AI in Financial Services: Emerging Trends and Opportunities

Presenter:
Awais Bajwa, Head of Data & AI Banking, Bank of America

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: MLOps for AgenticAI: How to Manage Agents in Production

Presenter:
Eero Laaksonen, CEO & Founder, Valohai

About the Speaker:
Serial entrepreneur hippie with a keen interest in making the world a better place. I believe people should work less and enjoy life more. Currently escalating the adoption of machine learning in enterprises around the world with Valohai.

Talk Track: Business Strategy

Talk Technical Level: 3/7

Talk Abstract:
AI Agents are taking over (and for good reason). There’s infinite yet untapped potential for everyone, from enterprises to startups, working on everything from supporting internal operations to shipping user-facing features. New players are emerging specifically to offer AI Agents as a service, often catering to specific industries.

However, very few have succeeded in getting their AI Agents to production and generating value from them. One of the main reasons is the complex infrastructure and MLOps best practices that must be in place from day one.

What You’ll Learn
– How to build the foundation for future-proofing the success of proprietary AI Agents
– The trade-offs in MLOps stacks and AI infrastructure in the Agentic AI space
– How to manage AI Agents in production and maximize return on investment

Talk: Measuring the Minds of Machines: Evaluating Generative AI Systems

Presenter:
Jineet Doshi, Staff Data Scientist/AI Lead, Intuit

About the Speaker:
Jineet Doshi is an award-winning AI Lead and Engineer with over 7 years of experience. He has a proven track record of leading successful AI projects and building machine learning models from design to production across various domains, which have impacted millions of customers and have significantly improved business metrics, leading to millions of dollars of impact. He is currently an AI Lead at Intuit, where he is one of the architects of their Generative AI platform which was featured on Forbes and Wall Street.

Jineet has also delivered guest lectures at Stanford University and UCLA on Applied AI. He is on the Advisory Board of the University of San Francisco’s AI Program. He holds multiple patents in the field, has advised numerous AI startups, and has co-chaired workshops at top AI conferences like KDD.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
Evaluating LLMs is essential in establishing trust before deploying them to production. Even post-deployment, evaluation is essential to ensure LLM outputs meet expectations, making it a foundational part of LLMOps. However, evaluating LLMs remains an open problem. Unlike traditional machine learning models, LLMs can perform a wide variety of tasks, such as writing poems, Q&A, and summarization. This leads to the question: how do you evaluate a system with such broad intelligence capabilities? This talk covers the various approaches for evaluating LLMs along with the pros and cons of each. It also covers evaluating LLMs for safety and security, and the need for a holistic approach to evaluating these very capable models.

What You’ll Learn
The audience will learn why evaluating GenAI systems is fundamental yet remains an open problem; a broad overview of different techniques for evaluating GenAI systems (including some state-of-the-art ones), along with the pros and cons of each; how other ML practitioners are doing LLM evals; and techniques for evaluating safety and security.

Talk: From Silos to Synergy: MLOps & Developers Unified

Presenter:
Yuval Fernbach, VP & CTO, JFrog

About the Speaker:
Yuval Fernbach is the CTO of MLOps at JFrog, previously co-founder and CTO of Qwak.
With over a decade of experience in data and machine learning, Yuval led the creation of a user-friendly ML Platform that simplifies building, training, and deploying models. Before Qwak, he served as an ML Specialist at AWS, helping clients harness machine learning to drive business transformation. Yuval is passionate about using data and technology to foster innovation.

Talk Track: Advanced Technical/Research

Talk Technical Level: 6/7

Talk Abstract:
In the evolving landscape of software, machine learning is no longer an isolated component; it’s an integral part of the entire secure software supply chain. For ML engineers, this shift presents an exciting opportunity to go beyond experimentation and model development to actively contribute to the secure and scalable delivery of AI solutions. This talk will explore how unifying MLOps with traditional software development processes enhances security, streamlines deployments, and enables ML models to be part of the broader company-wide deployment strategy. Learn how becoming part of the overall software supply chain can empower you to make a bigger impact, ensuring your models reach production safely and effectively, while aligning with global deployment standards.

What You’ll Learn:
TBD

Talk: Enabling Safe Enterprise Adoption of Generative AI

Presenter:
John Hearty, Head of AI Governance, Mastercard

About the Speaker:
TBD

Talk Track: Case Study

Talk Technical Level: 6/7

Talk Abstract:
At Mastercard, we have over a decade of experience leveraging AI, with a mature AI Governance program that provides oversight, and enables the fair, effective, and transparent use and development of AI solutions. However, Generative AI has brought new challenges and risks, which have made us rethink our processes.

What You’ll Learn:
We will discuss Mastercard’s journey to setting up our AI Governance Program, and how we’ve adapted it to meet the demands of emerging technology.
We will also discuss:
– How we have operationalized responsible AI development
– The new possibilities that Generative AI brings, as well as the challenges and how we have adapted to them
– Ways of leveraging this technology in a safe and effective way
– Lessons learned from a relatively small team enabling a major enterprise (the importance of strategic partnerships!)
– Scaling enterprise-wide adoption of consistent governance frameworks and risk management techniques for GenAI, focusing on process and scale

Talk: On-Device ML for LLMs: Post-training Optimization Techniques with T5 and Beyond

Presenter:
Sri Raghu Malireddi, Senior Machine Learning Engineer, Grammarly

About the Speaker:
Sri Raghu Malireddi is a Senior Machine Learning Engineer at Grammarly, working on On-Device Machine Learning. He specializes in deploying and optimizing Large Language Models (LLMs) on-device, focusing on improving system performance and algorithm efficiency. He has played a key role in the on-device personalization of the Grammarly Keyboard. Before joining Grammarly, he was a Senior Software Engineer and Tech Lead at Microsoft, working on several key initiatives for deploying machine learning models in Microsoft Office products.

Talk Track: Advanced Technical/Research

Talk Technical Level: 4/7

Talk Abstract:
This session explores the practical aspects of implementing Large Language Models (LLMs) on devices, focusing on models such as T5 and its modern variations. Deploying ML models on devices presents significant challenges due to limited computational resources and power constraints. However, On-Device ML is crucial as it reduces dependency on cloud services, enhances privacy, and lowers latency.

Optimizing LLMs for on-device deployment requires advanced techniques to balance performance and efficiency. Grammarly is at the forefront of On-Device ML, continuously innovating to deliver high-quality language tools. This presentation offers valuable insights for anyone interested in the practical implementation of on-device machine learning using LLMs, drawing on Grammarly’s industry application insights.

The topics covered in this talk are:
– Techniques for optimizing performance and reducing inference latency in LLMs: quantization, pruning, layer fusion, etc.
– Methods to develop efficient and scalable AI solutions on edge devices.
– Common challenges in deploying LLMs to edge devices: over-the-air updates, logging, and debugging issues in production.
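One of the optimization techniques above can be sketched in a few lines. The following is a minimal, hypothetical illustration of magnitude-based weight pruning (not Grammarly’s actual implementation): rank weights by absolute value and zero out the smallest fraction.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # how many weights to zero
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# Half the weights are zeroed; the large-magnitude ones survive.
print(magnitude_prune([0.01, -0.8, 0.05, 1.2, -0.02, 0.3], 0.5))
# [0.0, -0.8, 0.0, 1.2, 0.0, 0.3]
```

In practice the resulting sparsity only pays off on-device when paired with a sparse storage format or kernels that skip zeros, and frameworks prune per-layer tensors rather than flat lists.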

What You’ll Learn
TBA

Talk: Code Generation Agents: Architecture, Data Modeling Challenges, and Production-Ready Considerations

Presenters:
Lee Twito, GenAI Lead, Lemonade | Alon Gubkin, Co-Founder & CTO, Aporia

About the Speaker:
In 2019, Alon Gubkin cofounded Aporia, the ML observability platform. Aporia is trusted by Fortune 500 companies and data science teams in every industry to ensure responsible AI and monitor, improve, and scale ML models in production. Alon, an ex-R&D team lead in the elite Unit 81 intelligence unit of the Israel Defense Forces, has led Aporia in raising $30 million from investors like Tiger Global Management and Samsung Next. For two years in a row, 2022 and 2023, Alon was named to Forbes 30 Under 30.

Talk Track: Case Study

Talk Technical Level: 5/7

Talk Abstract:
This session provides an in-depth look at the architecture of multi-agent systems for code generation, emphasizing practical solutions to common data modeling challenges. We’ll explore how focusing on signature data, employing linters, indexing codebases, and utilizing GraphRAG can address issues often faced in agent-driven coding environments. We’ll examine examples where models fall short, and show new concepts that resolve these challenges. You’ll leave with a practical understanding of how multi-agent systems are structured, as well as insights on minimizing bugs and enhancing reliability.

What You’ll Learn:
You’ll leave with a deep understanding of how advanced multi-agent systems for code generation operate, including solutions to common data modeling challenges. You’ll also discover practical techniques—like using linters, codebase indexing, and graph retrieval-augmented generation—that you can apply to make your own agent systems more reliable and efficient.

Talk: Demystifying Multi-Agent Patterns

Presenter:
Pablo Salvador Lopez, Principal AI Architect, Microsoft

About the Speaker:
As a seasoned engineer with extensive experience in AI and machine learning, I possess a unique blend of skills in full-stack data science, machine learning, and software engineering, complemented by a solid foundation in mathematics. My expertise lies in designing, deploying, and monitoring AI/ML software products at scale, adhering to MLOps/LLMOps and best practices in software engineering.

Having previously led the MLOps practice at Concentrix Catalyst and the ML Engineering global team at Levi Strauss & Co., I have developed a profound understanding of implementing real-time and batch time ML solutions for several Fortune 500 enterprises. This experience has significantly enhanced my ability to manage big data and leverage cloud engineering, particularly with Azure’s AI, GCP and AWS.

Currently, at Microsoft, as a Principal Technical Member of the prestigious AI Global Black Belt team, I am dedicated to empowering the world’s largest enterprises with cutting-edge generative AI and machine learning solutions. My role involves driving transformative outcomes through the adoption of the latest AI technologies and demystifying the most complex architectural and development patterns. Additionally, I am actively involved in shaping the industry’s direction in LLMOps and contributing to open source by publishing impactful software and AI solutions.

Talk Track: Applied Case Studies

Talk Technical Level: 5/7

Talk Abstract:
How to successfully build and productionize a multi-agent architecture with Semantic Kernel and AutoGen.

What You’ll Learn
The audience will learn how to build a multi-agent architecture following best practices using open-source technology like Semantic Kernel and AutoGen. This session will accelerate the journey from single-agent to multi-agent systems and show how to productionize these systems at scale using best practices for LLMs in production.

Talk: Code Smarter, not harder: Generative AI in the Software Development Lifecycle

Presenter:
Keri Olson, VP Product Management, IBM

About the Speaker:
Keri Olson is the Vice President of Product Management for IBM AI for Code and Head of Product for IBM watsonx Code Assistant. She has over 20 years of experience in enterprise software and has held roles in Product Management, Engineering, Operations, Transformation, and Corporate Consulting. Keri is passionate about software and product development, AI for Code, driving innovation, and building strong technical and business partnerships. She is based in Rochester Minnesota, and she enjoys volunteering in the community as well as mentoring technical and business professionals to help build the next generation of leaders.

Talk Track: Applied Case Studies

Talk Technical Level: 6/7

Talk Abstract:
In this session we will dive into the quickly evolving landscape of AI coding assistants and how they play a pivotal role in the software development lifecycle (SDLC). Join us to learn how IBM watsonx Code Assistant serves as your AI-powered coding companion through examples and a live demo.

What You’ll Learn
How to use a generative AI code assistant (IBM watsonx Code Assistant) to generate, query and document your code – and much more. Learn how to use the same tools to modernize legacy code (in Java) to create a modern code-base fit for today’s needs.

Talk: Generative AI Infrastructure at Lyft

Presenter:
Konstantin Gizdarski, ML Engineering, Lyft

About the Speaker:
Konstantin is an engineer at Lyft where he has worked on expanding the company’s capabilities in machine learning. Originally from Bulgaria, Konstantin grew up in the San Francisco Bay Area and attended Northeastern in Boston as an undergraduate.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
In this talk, we will present the Gen AI infrastructure stack at Lyft.

We will talk about the components that were already part of our ML platform, which we reused to support AI applications, such as:
– model training
– model serving

Next, we will talk about some of the novel AI-related components we built:
– AI vendor gateway
– custom clients
– LLM evaluation
– PII preserving infrastructure

Finally, we will share one or two use-cases that have been utilizing Gen AI at Lyft.

What You’ll Learn
You will learn how to evolve an ML Platform into an AI Platform.

Talk: GenAI ROI: From Pilot to Profit

Presenter:
Ilyas Iyoob, Faculty, University of Texas; Head of Research, Kyndryl; Venture Partner, Clutch VC

About the Speaker:
Dr. Ilyas Iyoob is faculty of Data Science and Artificial Intelligence in the Cockrell School of Engineering at the University of Texas. He pioneered the seamless interaction between machine learning and operations research in the fields of autonomous computing, health-tech, and fin-tech. Previously, Dr. Iyoob helped build a cloud computing AI startup and successfully sold it to IBM. He currently advises over a dozen venture funded companies and serves as the Global Head of Research at Kyndryl (IBM Spinoff). He has earned a number of patents and industry recognition for applied Artificial Intelligence and was awarded the prestigious World Mechanics prize by the University of London.

Talk Track: Case Study

Talk Technical Level: 1/7

Talk Abstract:
In this session, we will dive deep into the real-world ROI of Generative AI, moving beyond pilot projects and into scalable, value-driving solutions. With real-world examples from our enterprise implementations, we reveal the hidden costs, unexpected value, and key metrics that truly matter when measuring success. We will also explore practical steps to overcome “pilot paralysis” and strategies for balancing innovation with cost control.

What You’ll Learn
Whether you’re a decision-maker or AI leader, this session will provide actionable insights on how to make GenAI work for your business, ensuring it delivers measurable impact and not just hype.

Talk: Scaling Vector Database Usage Without Breaking the Bank: Quantization and Adaptive Retrieval

Presenter:
Zain Hasan, Senior ML Developer Advocate, Weaviate

About the Speaker:
Zain Hasan is a Senior Developer Advocate at Weaviate, an open-source vector database. He is an engineer and data scientist by training, who pursued his undergraduate and graduate work at the University of Toronto building artificially intelligent assistive technologies. He then founded his company developing a digital health platform that leveraged machine learning to remotely monitor chronically ill patients. More recently he practiced as a consultant senior data scientist in Toronto. He is passionate about open-source software, education, community, and machine learning, and has delivered workshops and talks at multiple events and conferences.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
Everybody loves vector search, and enterprises now see its value thanks to the popularity of LLMs and RAG. The problem is that production-level deployment of vector search requires boatloads of compute: CPU for search and GPU for inference. The bottom line is that, if deployed incorrectly, vector search can be prohibitively expensive compared to classical alternatives.

The solution: quantizing vectors and performing adaptive retrieval. These techniques let you scale applications into production by letting you reliably balance and tune memory costs, latency, and retrieval accuracy.

I’ll talk about how you can perform real-time billion-scale vector search on your laptop! This includes covering different quantization techniques, including product, binary, scalar, and matryoshka quantization, which compress vectors by trading off memory requirements for accuracy. I’ll also introduce the concept of adaptive retrieval, where you first perform a cheap, hardware-optimized, low-accuracy search over compressed vectors to identify retrieval candidates, followed by a slower, higher-accuracy search to rescore and correct.

These quantization techniques, when used with well-thought-out adaptive retrieval, can lead to a 32x reduction in memory cost requirements at the cost of roughly 5% loss in retrieval recall in your RAG stack.
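The two ideas combine naturally. Below is a toy, pure-Python sketch of binary quantization followed by a two-pass adaptive search (an illustration under assumed names, not Weaviate’s implementation; real systems precompute the binary codes and pack them into machine words):

```python
def binarize(vec):
    # Binary quantization: keep only the sign of each dimension,
    # 1 bit per dimension vs. 32 for float32 (the source of the ~32x saving).
    return [1 if x > 0 else 0 for x in vec]

def hamming(a, b):
    # Cheap distance between binary codes.
    return sum(x != y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adaptive_search(query, vectors, shortlist=10, top_k=3):
    # Pass 1: low-accuracy search over the compressed (binary) vectors.
    bq = binarize(query)
    candidates = sorted(range(len(vectors)),
                        key=lambda i: hamming(bq, binarize(vectors[i])))[:shortlist]
    # Pass 2: rescore only the shortlist with the full-precision vectors.
    return sorted(candidates, key=lambda i: -dot(query, vectors[i]))[:top_k]

docs = [[1.0, 1.0], [-1.0, -1.0], [0.5, 0.5], [2.0, -3.0]]
print(adaptive_search([1.0, 1.0], docs, shortlist=3, top_k=1))  # [0]
```

The shortlist size is the knob: a larger shortlist recovers more of the recall lost to compression, at the cost of more full-precision work in the second pass.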

What You’ll Learn
TBA

Talk: Multimodal LLMs for Product Taxonomy at Shopify

Presenter:
Kshetrajna Raghavan, Senior Staff ML Engineer, Shopify

About the Speaker:
With over 12 years of industry experience spanning healthcare, ad tech, and retail, Kshetrajna Raghavan has spent the last four years at Shopify building cutting-edge machine learning products that make life easier for merchants. From Product Taxonomy Classification to Image Search and Financial Forecasting, Kshetrajna has tackled a variety of impactful projects. Their favorite? The Product Taxonomy Classification model, a game-changer for Shopify’s data infrastructure and merchant tools.

Armed with a Master’s in Operations Research from Florida Institute of Technology, Kshetrajna brings a robust technical background to the table.

When not diving into data, Kshetrajna loves jamming on guitars, tinkering with electric guitar upgrades, hanging out with two large dogs, and conquering video game worlds.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
At Shopify, we fine-tune and deploy large vision-language models in production to make millions of predictions a day, leveraging different open-source tooling to achieve this.
In this talk, we walk through how we did this for a generative AI use case at Shopify’s scale.

What You’ll Learn
a. Getting to Know Vision Language Models:

The Basics: We’ll kick things off with a quick rundown of what vision language models are and how they work.
Cool Uses: Dive into some awesome ways these models are being used in e-commerce, especially at Shopify.
b. Fine-Tuning and Deployment:

Tweaking the Models: Learn the ins and outs of fine-tuning these big models for specific tasks.
Going Live: Tips and tricks for deploying these models so they can handle millions of predictions every day without breaking a sweat.
c. Open Source Tools:

Tool Talk: How to pick the right open-source tools for different stages of your model journey.
Smooth Integration: Real-life examples of how we fit these tools into our workflows at Shopify.
d. Scaling Up and Speeding Up:

Scaling Challenges: The hurdles we faced when scaling these models and how we jumped over them.
Speed Boosts: Techniques to keep things running fast and smooth in a production setting.
e. Generative AI Case Study:

Deep Dive: A step-by-step look at a specific generative AI project we tackled at Shopify, from start to finish.
Key Takeaways: What we learned along the way and how you can apply these lessons to your own projects.

Talk: Large Language Model Training and Serving at LinkedIn

Presenter:
Dre Olgiati, Distinguished Engineer, AI/ML, LinkedIn

About the Speaker:
Dre is a Distinguished Engineer at LinkedIn, where he leads wide-ranging initiatives relevant to large model training, serving, MLOps and more.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
In this talk, Dre will describe some of the fundamental challenges and solutions faced by the LinkedIn team as they build innovative products based on LLMs and agents.

What You’ll Learn
How do I build scalable training and serving solutions for large language models (LLMs)? What are the challenges in scaling LLM training and serving?

Talk: Evolving with AI: Insights from Nylas’ Generative AI Journey

Presenter:
Nadia Rauch, Senior Engineering Manager Intelligence, Nylas

About the Speaker:
Nadia is an experienced leader managing Nylas’ Machine Learning and Data Engineering teams, where she joined in 2020. With 14+ years of experience building products and working with data in multiple roles, she brings a wealth of expertise to her current position. Nadia holds a Bachelor’s and Master’s degree in Computer Engineering from the University of Florence and specializes in Knowledge Bases and Ontologies. During her doctoral studies, she developed a pioneering Smart City system, an innovative project which showcased her ability to harness data from diverse sources to create actionable insights for urban planning and management. Additionally, Nadia is a dedicated advocate for diversity in tech, serving as a committee member for TMLS – Women x AI, where she empowers young women entering the field.

Talk Track: Case Study

Talk Technical Level: 3/7

Talk Abstract:
In today’s rapidly evolving technological landscape, the integration of cutting-edge technologies such as Generative Artificial Intelligence (AI) presents both unprecedented opportunities and challenges for businesses. This presentation delves into the journey of Nylas in crafting and executing a strategic roadmap for the adoption of Generative AI within our organization.

Drawing upon real-world experiences and insights gleaned from our implementation process, we offer a firsthand account of the strategies, methodologies, and best practices that enabled us to seamlessly integrate Generative AI into our workflows while concurrently fulfilling existing customer commitments (in just 3 months!).

What You’ll Learn:
TBD

Talk: Toyota's Generative AI Journey

Presenter:
Ravi Chandu Ummadisetti, Generative AI Architect, Toyota

About the Speaker:
Ravi Chandu Bio (Generative AI Architect): Ravi Chandu Ummadisetti is a distinguished Generative AI Architect with over a decade of experience, known for his pivotal role in advancing AI initiatives at Toyota Motor North America. His expertise in AI/ML methodologies has driven significant improvements across Toyota’s operations, including a 75% reduction in production downtime and the development of secure, AI-powered applications. Ravi’s work at Toyota, spanning manufacturing optimization, legal automation, and corporate AI solutions, showcases his ability to deliver impactful, data-driven strategies that enhance efficiency and drive innovation. His technical proficiency and leadership have earned him recognition as a key contributor to Toyota’s AI success.

Kordel France Bio (AI Architect): Kordel brings a diverse background of experiences in robotics and AI from both academia and industry. He has multiple patents in advanced sensor design and spent much of the past few years founding and building a successful sensor startup that enables the sense of smell for robotics. He is on the board of multiple startups and continues to further his AI knowledge as an AI Architect at Toyota.

Eric Swei Bio (Senior Generative AI Architect): Boasting an impressive career spanning over two decades, Eric Swei is an accomplished polymath in the tech arena, with deep-seated expertise as a full stack developer, system architect, integration architect, and specialist in computer vision, alongside his profound knowledge in generative AI, data science, IoT, and cognitive technologies.

At the forefront as the Generative AI Architect at Toyota, Eric leads a formidable team in harnessing the power of generative AI. Their innovative endeavors are not only enhancing Toyota’s technological prowess but also redefining the future of automotive solutions with cutting-edge AI integration.

Stephen Ellis Bio (Technical Generative AI Product Manager): Stephen has 10 years of experience in research strategy and the application of emerging technologies at companies ranging from startups to Fortune 50 enterprises. He is a former Director of the North Texas Blockchain Alliance, where he led the cultivation of blockchain and cryptocurrency competencies among software developers, C-level executives, and private investment advisors. He was previously CTO of Plymouth Artificial Intelligence, which researched and developed future applications of AI; in that capacity he advised companies on building platforms that leverage emerging technologies for new business cases. He is currently a Technical Product Manager at Toyota Motor North America, focused on enabling generative AI solutions for groups across the enterprise to drive transformation in new mobility solutions and enterprise operations.

Talk Track: Case Study

Talk Technical Level: 2/7

Talk Abstract:
Team Toyota will delve into their innovative journey with generative AI in automotive design, exploring how Toyota research integrates traditional engineering constraints with state-of-the-art generative AI techniques, enhancing designers’ capabilities while ensuring safety and performance considerations.

What You’ll Learn
1. Toyota’s Innovation Legacy
2. Leveraging LLMs in Automotive – battery, vehicle, manufacturing, etc.
3. Failures in Generative AI projects
4. Education to business stakeholders

Talk: Creating Our Own Private OpenAI API

Presenters:
Meryem Arik, Co-Founder & CEO, TitanML | Hannes Hapke, Principal Machine Learning Engineer, Digits

About the Speaker:
Meryem is the Co-founder and CEO of TitanML. She is a prominent advocate of Women in AI and a TEDx speaker.

Hannes Hapke is a principal machine learning engineer at Digits, where he develops innovative ways to use machine learning to boost productivity for business owners and accountants. Prior to joining Digits, Hannes solved machine learning infrastructure problems in various industries including healthcare, retail, recruiting, and renewable energies.

Hannes actively contributes to TensorFlow’s TFX Addons project, has co-authored machine learning publications including the O’Reilly books “Building Machine Learning Pipelines” and “Machine Learning Production Systems,” and has presented state-of-the-art ML work at conferences such as ODSC 2022 and O’Reilly’s TensorFlow World.

Talk Track: Case Study

Talk Technical Level: 4/7

Talk Abstract:
Recent advancements in open-source large language models (LLMs) have positioned them as viable alternatives to proprietary models. However, the journey to deploying these open-source LLMs is fraught with challenges, particularly around infrastructure requirements and optimization strategies.

In this talk, Meryem Arik and Hannes Hapke will provide a detailed roadmap for startups and corporations aiming to deploy open-source LLMs effectively. Leveraging real-world examples, they will illustrate the practical steps and considerations essential for successful implementation. Additionally, they will share invaluable lessons learned from their own deployment experiences, offering attendees actionable insights to navigate the complexities of open-source LLM deployment.

What You’ll Learn:
Real-world examples and lessons learned from 18+ months of deploying LLMs, mixed with deep technical insights from Meryem and the team

Talk: Revolutionizing Venture Capital: Leveraging Generative AI for Enhanced Decision-Making and Strategic

Presenter:
Yuvaraj Tankala, AI Engineer and Venture Capital Innovator, Share Ventures

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: A Data Scientist’s Guide to Unit & End-to-End Testing

Presenter:
Vatsal Patel, Senior Data Scientist, MongoDB

About the Speaker:
Vatsal Patel, Senior Data Scientist, MongoDB

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
A comprehensive guide designed to equip data scientists with essential knowledge and practical skills for testing their developed and deployed models.

Key Topics Covered:
– Why Testing is Crucial in ML: Understand the importance of testing in the machine learning lifecycle and how it ensures model reliability and performance.

– Test-Driven Development: Learn about the TDD methodology, its benefits, and how it encourages writing clean, maintainable code by defining tests before implementing the functionality.

– Tools for Testing ML Models: Explore tools for unit testing, end-to-end testing, and CI/CD integration, such as `unittest`, `pytest`, `drone`, and `GitHub Actions`.

– Unit Testing:
– Basic understanding of what unit testing is and its importance in verifying individual components of the ML pipeline.
– Best Practices: Identify best practices on how to approach testing, write test cases, and implement tests using popular frameworks.
– Tutorial: Practical examples illustrating how to write and run unit tests for data preprocessing functions and modelling.
– Running Tests: Instructions on running unit tests locally using `pytest`.

– End-to-End (E2E) Testing:
– Basic understanding of end-to-end testing, which validates the entire ML workflow from data ingestion to model serving.
– Dependency Injection: Understand dependency injection and how it helps isolate components and create flexible test configurations.
– Best Practices: Best practices on defining workflows, writing test cases for critical paths, and implementing tests using E2E frameworks.
– Tutorial: Example of writing end-to-end tests for a scoring/training pipeline.
– Running Tests: Guidance on executing E2E tests locally using `pytest`.

– Integration into CI/CD Pipelines: Learn how to automate model testing and deployment by integrating unit and end-to-end tests into CI/CD pipelines, ensuring continuous validation of code changes. This part will leverage a Makefile and go over code coverage.

This presentation is ideal for data scientists, machine learning engineers, and anyone developing and deploying ML models who want to enhance their testing practices. By the end of the session, attendees will have a solid understanding of unit and end-to-end testing principles, practical examples to follow, and the confidence to implement these testing strategies in their projects.
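
To give a flavor of the unit-testing tutorial, here is a minimal sketch in the spirit of the workshop (the function and test names are illustrative, not taken from the session materials); `pytest` discovers and runs any function whose name starts with `test_`:

```python
# test_preprocessing.py -- run with: pytest test_preprocessing.py
def normalize(values):
    """Scale a list of numbers to the [0, 1] range (a typical preprocessing step)."""
    lo, hi = min(values), max(values)
    if lo == hi:  # guard against division by zero for constant inputs
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_bounds():
    # The output should always span exactly [0, 1] for non-constant input.
    result = normalize([2, 4, 6])
    assert min(result) == 0.0 and max(result) == 1.0

def test_normalize_constant_input():
    # The edge case a test suite catches before production does.
    assert normalize([5, 5, 5]) == [0.0, 0.0, 0.0]
```

Running `pytest test_preprocessing.py` executes both tests and reports any failure with the introspected assert values.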

What You’ll Learn:
TBA

Talk: Building Agentic and Multi-Agent Systems with LangGraph

Presenters:
Greg Loughnane, Co-Founder, AI Makerspace | Chris Alexiuk, Co-Founder & CTO, AI Makerspace

About the Speaker:
Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Since 2021 he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.

Chris Alexiuk is the Co-Founder & CTO at AI Makerspace, where he is an instructor for their AI Engineering Bootcamp. Previously, he was a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
2024 is the year of agents, agentic RAG, and multi-agent systems!

This year, people and companies aim to build more complex LLM applications and models; namely, ones that are ever more capable of leveraging context and reasoning. For applications to leverage context well, they must provide useful input to the context window (e.g., in-context learning), through direct prompting or through search and retrieval (e.g., Retrieval Augmented Generation, or RAG). To leverage reasoning is to leverage the Reasoning-Action (ReAct) pattern, and to be “agentic” or “agent-like.” Another way to think about agents is that they enhance search and retrieval through the intelligent use of tools or services.

The industry’s best-practice tool for building complex LLM applications is LangChain. To build agents as part of the LangChain framework, we leverage LangGraph, which allows us to bake cyclical reasoning loops into our application logic. LangChain v0.2, the latest version of this leading orchestration tooling, incorporates LangGraph directly as the engine that powers stateful (and even fully autonomous) agent cycles.

In this session, we’ll break down all the concepts and code you need to understand and build the industry-standard agentic and multi-agent systems, from soup to nuts.

What You’ll Learn
– A review of the basic prototyping patterns of GenAI, including Prompt Engineering, RAG, Fine-Tuning, and Agents
– The core ideas and constructs to build agentic and multi-agent applications with LangGraph
– ⛓️ Build custom agent applications with LangGraph
– 🤖 Develop multi-agent workflows with LangGraph
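
As a taste of what the session covers, the cyclical “reason, then act” loop that LangGraph expresses as a graph can be sketched framework-free. Everything below (the `toy_llm`, the tool registry, the state dict) is an illustrative stand-in, not LangGraph’s actual API:

```python
# Framework-agnostic sketch of a cyclical reasoning loop: the agent
# alternates between a "reason" step (pick a tool or finish) and an
# "act" step (call the chosen tool) until it decides to stop.
def run_agent(question, llm, tools, max_steps=5):
    state = {"question": question, "observations": []}
    for _ in range(max_steps):
        decision = llm(state)               # "reason" node: pick a tool or finish
        if decision["action"] == "finish":
            return decision["answer"]
        tool = tools[decision["action"]]    # "act" node: call the chosen tool
        state["observations"].append(tool(decision["input"]))
    return "Gave up after max_steps"        # cycle guard, like a graph's recursion limit

# Toy "LLM" that searches once, then finishes with what it observed.
def toy_llm(state):
    if state["observations"]:
        return {"action": "finish", "answer": state["observations"][-1]}
    return {"action": "search", "input": state["question"]}

answer = run_agent("capital of France?", toy_llm, {"search": lambda q: "Paris"})
# answer == "Paris"
```

In LangGraph the same cycle is expressed as graph nodes with conditional edges and a recursion limit, rather than an explicit `for` loop.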

Talk: MLOps Template for Time Series in Production

Presenter:
Eddie Mattia, Data Scientist, Outerbounds

About the Speaker:
Data scientist at Outerbounds

Building AI developer tools and many applications on top of them!

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
In this session, we will build a complete MLOps platform that periodically retrains models and computes predictions in a batch inference pipeline. We’ll show how to build a time series forecasting machine with these properties and how the entire system can be deployed in the cloud.

What You’ll Learn:
How to frame time series forecasting problems for XGBoost.
How to build an end-to-end MLOps system.
How to trigger workflows in the cloud based on exogenous system events.
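
The first learning goal, framing forecasting for XGBoost, usually amounts to turning the series into rows of lagged features so a tabular model can predict the next step. A minimal, library-free sketch (the function name is illustrative):

```python
def make_lag_features(series, n_lags):
    """Turn a time series into (X, y) rows of trailing lags, so a tabular
    model like XGBoost can learn one-step-ahead forecasts."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # the n_lags values before time t
        y.append(series[t])             # the value to predict at time t
    return X, y

X, y = make_lag_features([10, 12, 13, 15, 14], n_lags=2)
# X == [[10, 12], [12, 13], [13, 15]], y == [13, 15, 14]
```

Each row pairs a target with its preceding observations; calendar features and rolling statistics are added the same way in practice.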

Talk: Hands-on Scalable Edge-to-Core ML Pipelines

Presenters:
Debadyuti Roy Chowdhury, VP Products, InfinyOn | Sehyo Chang, CTO, InfinyOn

About the Speaker:
Deb leads product management at InfinyOn, a distributed streaming infrastructure company. Deb’s career since 2006 spans IT, server administration, software and data engineering, leading data science and AI practices, and product management in HealthTech, Public Safety, Manufacturing, and Ecommerce.

Sehyo Chang is the CTO and Co-founder of InfinyOn. He is also the creator of the Fluvio open-source project. He dabbled in WASM technology at an early stage and spearheaded InfinyOn to join Bytecode Alliance. He is a veteran of the open-source business model. Previously he was at NGINX, where he developed nginmesh and Rust binding for NGINX.

Talk Track: Workshop

Talk Technical Level: 7/7

Talk Abstract:
In this intensive workshop, participants will gain hands-on experience in designing, implementing, and troubleshooting a real-world distributed ML pipeline that spans from edge devices to core infrastructure. We’ll tackle key MLOps challenges in building and managing complex, scalable systems for both operational analytics and AI/ML workflows.
Key topics covered:

Edge Computing: Simulating data ingestion from edge devices
Streaming Architecture: Implementing real-time data flows with open-source tools
Distributed Processing: Scaling ML workloads across heterogeneous environments
Model Deployment: Strategies for serving models at the edge and in the cloud
Observability and Monitoring: Setting up comprehensive monitoring for distributed ML systems
MLOps Best Practices: Applying DevOps principles to ML lifecycle management

Hands-on activities:

Participants will work in small groups to build a complete edge-to-core ML pipeline
Each team will deploy a pre-trained model for real-time inference at the edge
Groups will implement data validation and model monitoring across the pipeline
Participants will troubleshoot common issues in distributed ML systems

What You’ll Learn
Attendees will gain hands-on experience in designing, implementing, and troubleshooting a real-world distributed dataflow spanning operational analytics and AI/ML pipelines.

– Practical experience in designing scalable, distributed ML architectures
– Understanding of MLOps challenges in edge-to-core systems
– Hands-on skills in deploying and monitoring ML models across diverse environments
– Strategies for optimizing performance and resource usage in complex ML pipelines
– Best practices for maintaining data quality and model accuracy in production systems

Talk: Evaluating LLM-Judge Evaluations: Best Practices

Presenter:
Aishwarya Naresh Reganti, Applied Scientist, Amazon

About the Speaker:
Aishwarya is an Applied Scientist in the Amazon Search Science and AI org. She works on developing large-scale graph-based ML techniques that improve Amazon Search quality, trust, and recommendations. She obtained her Master’s degree in Computer Science (MCDS) from Carnegie Mellon’s Language Technologies Institute in Pittsburgh. Aishwarya has over six years of hands-on machine learning experience and 20+ publications in top-tier conferences such as AAAI, ACL, CVPR, NeurIPS, and EACL. She has worked on a wide spectrum of problems involving large-scale graph neural networks, machine translation, multimodal summarization, social media and social networks, human-centric ML, artificial social intelligence, and code-mixing, and has mentored several Master’s and PhD students in these areas. Aishwarya serves as a reviewer for various NLP and graph ML conferences, including ACL, EMNLP, AAAI, and LoG. She has had the opportunity to work with some of the best minds in both academia and industry through collaborations and internships at Microsoft Research, the University of Michigan, NTU Singapore, IIIT-Delhi, NTNU Norway, the University of South Carolina, and others.

Talk Track: In-Person Workshop

Talk Technical Level: 5/7

Talk Abstract:
The use of LLM-based judges has become common for evaluating scenarios where labeled data is not available or where a straightforward test set evaluation isn’t feasible. However, this approach brings the challenge of ensuring that your LLM judge is properly calibrated and aligns with your evaluation goals. In this talk, I will discuss some best practices to prevent what I call the “AI Collusion Problem,” where multiple AI entities collaborate to produce seemingly good metrics but end up reinforcing each other’s biases or errors. This creates a ripple effect.

What You’ll Learn
– Gain insight into what LLM judges are and the components that make them effective tools for evaluating complex use cases.
– Understand the AI Collusion problem in context of evaluation and how it can create a ripple effect of errors.
– Explore additional components and calibration techniques that help maintain the integrity and accuracy of evaluations.
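
One simple calibration practice in the spirit of the talk: before trusting an LLM judge at scale, measure its agreement against a small set of human labels. A minimal sketch (the function is illustrative, not from the talk):

```python
def judge_agreement(judge_labels, human_labels):
    """Fraction of items where the LLM judge's verdict matches a human
    label -- a minimal calibration check before trusting the judge at scale."""
    assert len(judge_labels) == len(human_labels)
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

# 2 of 3 judge verdicts match the human raters
score = judge_agreement(["pass", "fail", "pass"], ["pass", "fail", "fail"])
```

Low agreement is an early warning sign of the collusion effect the talk describes: metrics that look good only because AI components are grading each other.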

Talk: Building a Multimodal RAG: A Step-by-Step Guide for AI/ML Practitioners

Presenter:
Ivan Nardini, Developer Relation Engineer, AI/ML, Google Cloud

About the Speaker:
Ivan Nardini is a Developer Relations Engineer on Google’s Cloud team, focusing on Artificial Intelligence and Machine Learning. He enables developers to build innovative AI and ML applications using their preferred libraries, models, and tools on Vertex AI, through code samples, online content, and events. Ivan has a master’s degree in Economics and Social Sciences from Università Bocconi and attended specialized training in Data Science at the Barcelona Graduate School of Economics.

Talk Track: Workshop

Talk Technical Level: 5/7

Talk Abstract:
Learn to build a multimodal Retrieval Augmented Generation (RAG) system that goes beyond text, incorporating images, video, and audio. This hands-on session dives deep into the architecture and foundational principles of multimodal RAG, enabling you to leverage diverse data sources for enhanced information retrieval and extraction.

What You’ll Learn:
Foundational principles of multimodal RAG systems
How to design and implement key RAG components (data ingestion, parsing, chunking, retrieval, ranking, etc.) for multimodal RAG.
Best practices for evaluating RAG performance and mitigating hallucinations
Hands-on experience building a multimodal RAG system on Vertex AI (with provided cloud credits!)
Strategies for scaling your RAG system from prototype to MVP and beyond.
Gain insights into scaling Multimodal RAG for real-world applications.
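
As a text-only miniature of two of the components listed above (chunking and retrieval), the sketch below uses naive word overlap as a stand-in for embedding similarity; a real multimodal RAG system on Vertex AI would embed images, video, and audio as well. All names are illustrative:

```python
def chunk(text, size=40, overlap=10):
    """Split text into overlapping character chunks for indexing."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def retrieve(query, chunks, k=2):
    """Rank chunks by naive word overlap with the query (a stand-in for
    embedding similarity) and return the top k."""
    query_words = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(query_words & set(c.lower().split())))
    return scored[:k]

index = chunk("Multimodal RAG retrieves relevant text, image, and audio chunks.", size=30)
top = retrieve("relevant chunks", index, k=1)
```

The retrieved chunks would then be ranked, deduplicated, and packed into the generator’s context window, the remaining stages the session walks through.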

Talk: Building Reliable AI: A Workshop to Build a Production Ready Multi-Modal Conversational Agent

Presenter:
Stefan Krawczyk, Co-Founder & CEO, DAGWorks Inc.

About the Speaker:
Stefan hails from New Zealand, speaks Polish, and completed his Masters at Stanford specializing in AI. He has spent over 15 years working across many parts of the stack, but has focused primarily on data and machine learning / AI related systems and their connection to building product applications. He has built many 0 to 1 and 1 to 3 versions of these systems at places like Stanford, Honda Research, LinkedIn, Nextdoor, Idibon, and Stitch Fix.

A regular conference speaker, Stefan has guest lectured at Stanford’s Machine Learning Systems Design course & Apps with LLMs Inside Course and is an author of two popular open source frameworks called Hamilton and Burr.

Stefan is currently co-founder and CEO of DAGWorks, where he’s building for the composable AI future that spans pipelines & agents with Hamilton & Burr.

Talk Track: Workshop

Talk Technical Level: 4/7

Talk Abstract:
Demoware is easy, but production grade? That’s hard.
In this session we’ll walk through how to build and iterate toward a production-grade multi-modal conversational agent.

Building a reliable AI agent is work, and we’ll walk through the concepts you’ll need to get there quickly and robustly. In this workshop we’ll use “first principles” where it makes sense, rather than off-the-shelf giga-libraries, so you can more easily build a mental model of what you need in order to achieve a reliable result.

In this session we’ll build a fictional hotel concierge agent to help customers book rooms, update reservations, ask questions, and complain.

What You’ll Learn:
– How to structure, build, debug, and improve a multi-modal tool bot that uses tool calling to hit “business endpoints”
– An overview of MLOps/GenAIOps concepts required to build reliable AI by doing it
– When to use an off-the-shelf framework, and when to build it yourself
– How to use popular minimalistic frameworks like Burr, LanceDB, PyTest, etc (exact tool set confirmed before the workshop)
– A candidate software development lifecycle to do this well

Talk: Agentic Workflows in Cybersecurity

Presenter:
Dattaraj Rao, Chief Data Scientist, Persistent

About the Speaker:
TBA

Talk Track: TBA

Talk Technical Level: 3/7

Talk Abstract:
TBA

What You’ll Learn
TBA

Talk: Open-Ended and AI-Generating Algorithms in the Era of Foundation Models

Presenter:
Jeff Clune, Professor, Computer Science, University of British Columbia; CIFAR AI Chair, Vector; Senior Research Advisor, DeepMind

About the Speaker:
Jeff Clune is a Professor of computer science at the University of British Columbia, a Canada CIFAR AI Chair at the Vector Institute, and a Senior Research Advisor at DeepMind. Jeff focuses on deep learning, including deep reinforcement learning. Previously he was a research manager at OpenAI, a Senior Research Manager and founding member of Uber AI Labs (formed after Uber acquired a startup he helped lead), the Harris Associate Professor in Computer Science at the University of Wyoming, and a Research Scientist at Cornell University. He received degrees from Michigan State University (PhD, master’s) and the University of Michigan (bachelor’s). More on Jeff’s research can be found at JeffClune.com or on Twitter (@jeffclune). Since 2015, he won the Presidential Early Career Award for Scientists and Engineers from the White House, had two papers in Nature and one in PNAS, won an NSF CAREER award, received Outstanding Paper of the Decade and Distinguished Young Investigator awards, received two test of time awards, and had best paper awards, oral presentations, and invited talks at the top machine learning conferences (NeurIPS, CVPR, ICLR, and ICML). His research is regularly covered in the press, including the New York Times, NPR, the New Yorker, CNN, NBC, Wired, the BBC, the Economist, Science, Nature, National Geographic, the Atlantic, and the New Scientist.

Talk Track: Virtual Talk

Talk Technical Level: 3/7

Talk Abstract:
Foundation models (e.g. large language models) create exciting new opportunities in our longstanding quests to produce open-ended and AI-generating algorithms, wherein agents can truly keep innovating and learning forever. In this talk I will share some of our recent work harnessing the power of foundation models to make progress in these areas. I will cover our recent work on OMNI (Open-endedness via Models of human Notions of Interestingness), Video Pre-Training (VPT), Thought Cloning, Automatically Designing Agentic Systems, and The AI Scientist.

What You’ll Learn
TBA

Talk: Open-Ended and AI-Generating Algorithms in the Era of Foundation Models

Presenter:
Maxime Labonne, Senior Staff Machine Learning Scientist, Liquid AI

About the Speaker:
Maxime Labonne is a Senior Staff Machine Learning Scientist at Liquid AI, serving as the head of post-training. He holds a Ph.D. in Machine Learning from the Polytechnic Institute of Paris and is recognized as a Google Developer Expert in AI/ML.

An active blogger, he has made significant contributions to the open-source community, including the LLM Course on GitHub, tools such as LLM AutoEval, and several state-of-the-art models like NeuralBeagle and Phixtral. He is the author of the best-selling book “Hands-On Graph Neural Networks Using Python,” published by Packt.

Connect with him on X and LinkedIn.

Talk Track: Applied Case Studies

Talk Technical Level: 5/7

Talk Abstract:
Fine-tuning LLMs is a fundamental technique for companies to customize models for their specific needs. In this talk, we will introduce fine-tuning and best practices associated with it. We’ll explore how to create a high-quality data generation pipeline, discuss fine-tuning techniques using popular libraries, explain how model merging works, and present the best ways to evaluate LLMs.

What You’ll Learn
Best practices for fine-tuning, creating a high-quality data generation pipeline, fine-tuning techniques, best fine-tuning libraries, how to do model merging, and evaluation methods for fine-tuned models.
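
Of the topics listed, model merging is the easiest to demystify in code: the simplest variant, linear merging (weighted averaging of checkpoints, as in “model soups”), just averages each parameter across fine-tuned models. A toy sketch with plain dicts standing in for state dicts:

```python
def linear_merge(state_dicts, weights=None):
    """Merge checkpoints by weighted-averaging each parameter.
    Plain floats stand in for tensors to keep the sketch dependency-free."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n  # default: uniform average
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

model_a = {"layer.weight": 1.0, "layer.bias": 0.5}
model_b = {"layer.weight": 3.0, "layer.bias": 0.1}
merged = linear_merge([model_a, model_b])
# merged["layer.weight"] == 2.0
```

Production merging tools apply the same idea tensor by tensor, with more sophisticated variants (SLERP, task arithmetic) changing how the average is taken.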

Talk: LLMidas’ Touch: Safely Adopting GenAI for Production Use-Cases

Presenter:
Gon Rappaport, Solution Architect, Aporia

About the Speaker:
I’m a solution architect at Aporia, which I joined just over two years ago. I’ve spent over eight years in the tech industry, starting in low-level programming and cybersecurity before transitioning to AI & ML.

Talk Track: Virtual Workshop

Talk Technical Level: 3/7

Talk Abstract:
During the session, we’ll explore the challenges of adopting GenAI in production use-cases. Focusing on the goal of using language models to solve more dynamic problems, we’ll address the dangers of “No-man’s-prod” and provide insights into safe and successful adoption. This presentation is designed for engineers, product managers, and stakeholders, and aims to provide a roadmap for releasing your first GenAI applications safely and successfully to production.

What You’ll Learn:

– Become familiar with the potential issues of using generative AI in production applications
– Learn how to mitigate the dangers of AI applications
– Learn how to measure the performance of different AI application types

Talk: Hemm: Holistic Evaluation of Multi-modal Generative Models

Presenter:
Anish Shah, ML Engineer, Weights & Biases

About the Speaker:
TBA

Talk Track: Virtual Workshop

Talk Technical Level: 3/7

Talk Abstract:
Join Anish Shah for an in-depth session on fine-tuning and evaluating multimodal generative models. This talk will delve into advanced methodologies for optimizing text-to-image diffusion models, with a focus on enhancing image quality and improving prompt comprehension.
Learn how to leverage Weights & Biases for efficient experiment tracking, enabling seamless monitoring and analysis of your model’s performance.

Additionally, discover how to utilize Weave, a lightweight toolkit for tracking and evaluating LLM applications, to conduct practical and holistic evaluations of multimodal models.

The session will also introduce Hemm, a comprehensive library for benchmarking text-to-image diffusion models on image quality and prompt comprehension, integrated with Weights & Biases and Weave. By the end of this talk, you’ll be equipped with cutting-edge tools and techniques to elevate your multimodal generative models to the next level.

What You’ll Learn:
Advanced Fine-Tuning Techniques: Explore methods for fine-tuning text-to-image diffusion models to enhance image quality and prompt comprehension.
Optimizing Image Quality: Understand the metrics and practices for assessing and improving the visual fidelity of generated images.
Enhancing Prompt Comprehension: Learn how to ensure your models accurately interpret and respond to complex textual prompts.
Utilizing Weights & Biases: Gain hands-on experience with Weights & Biases for tracking experiments, visualizing results, and collaborating effectively.
Leveraging Weave: Discover how Weave can be used for lightweight tracking and evaluation of LLM applications, providing practical insights into model performance.
Introduction to Hemm: Get acquainted with Hemm and learn how it facilitates comprehensive benchmarking of text-to-image diffusion models.
Holistic Model Evaluation: Learn best practices for conducting thorough evaluations of multimodal models, ensuring they meet desired performance standards across various metrics.

Talk: Serving GenAI Workload At Scale With LitServe

Presenter:
Aniket Maurya, Research Engineer, Lightning AI

About the Speaker:
I’m Aniket, a machine learning and software engineer with over 4 years of experience and a strong track record of developing and deploying machine learning models to production.

Talk Track: Virtual Workshop

Talk Technical Level: 5/7

Talk Abstract:
Learn about serving AI models with high throughput at scale: dynamic batching, autoscaling, and serving complex LLM-based workloads.

What You’ll Learn:
– Model serving in production
– Dynamic batching for high throughput
– Autoscaling
– Logging and monitoring in production
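
Dynamic batching, the second bullet, can be illustrated without any serving framework: requests queue up and are flushed as one batch when the batch is full or the oldest request has waited too long. This is a simplified sketch of the idea, not LitServe’s implementation:

```python
class DynamicBatcher:
    """Collect incoming requests and flush them as one batch when the
    batch is full or the oldest request has waited past the deadline."""
    def __init__(self, max_batch_size=4, max_wait_s=0.05):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = []  # list of (arrival_time, request)

    def submit(self, request, now):
        """Add a request; return a batch if one is ready, else None."""
        self.queue.append((now, request))
        full = len(self.queue) >= self.max_batch_size
        stale = now - self.queue[0][0] >= self.max_wait_s
        if full or stale:
            batch = [req for _, req in self.queue]
            self.queue = []
            return batch  # hand the whole batch to the model at once
        return None

batcher = DynamicBatcher(max_batch_size=2, max_wait_s=1.0)
assert batcher.submit("a", now=0.0) is None          # waits for more requests
assert batcher.submit("b", now=0.1) == ["a", "b"]    # batch full -> flush
```

Batching amortizes the per-call overhead of GPU inference across requests, which is where most of the throughput gain comes from.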

Talk: From Black Box to Mission Critical: Implementing Advanced AI Explainability and Alignment in FSIs

Presenter:
Vinay Kumar Sankarapu, Founder & CEO, Arya.ai

About the Speaker:
Vinay Kumar Sankarapu is the Founder and CEO of Arya.ai. He did his Bachelor’s and Master’s in Mechanical Engineering at IIT Bombay with research in Deep Learning and published his thesis on CNNs in manufacturing. He started Arya.ai in 2013, one of the first deep learning startups, along with Deekshith, while finishing his Master’s at IIT Bombay.

He co-authored a patent for designing a new explainability technique for deep learning and implementing it in underwriting in FSIs. He also authored a paper on AI technical debt in FSIs, and wrote multiple guest articles on ‘Responsible AI’ and ‘AI usage risks in FSIs’. He has given multiple technical and industry presentations globally: Nvidia GTC (SF & Mumbai), ReWork (SF & London), Cypher (Bangalore), Nasscom (Bangalore), TEDx (Mumbai), etc. He was the youngest member of the ‘AI task force’ set up by the Indian Commerce Ministry in 2017 to provide inputs on policy and to support AI adoption as part of Industry 4.0. He was listed in Forbes Asia 30-Under-30 in the technology section.

Talk Track: Virtual Workshop

Talk Technical Level: 4/7

Talk Abstract:
In highly regulated industries like FSIs, there are stringent policies regarding the use of ML models in production. To gain acceptance from all stakeholders, multiple criteria beyond model performance must be met.

This workshop will discuss the challenges of deploying ML and the stakeholders’ requirements in FSIs. We will review the sample setup in use cases like claim fraud monitoring and health claim processing, along with the case study details of model performance and MLOps architecture iterations.

The workshop will also discuss the AryaXAI MLObservability competition specifications and launch details.

What You’ll Learn:
In this workshop, you will gain a comprehensive understanding of the expectations of FSIs while deploying machine learning models. We’ll explore the additional criteria beyond model performance essential for gaining acceptance from various stakeholders, including compliance officers, risk managers, and business leaders. We’ll delve into how AI explainability outputs must be iterated for multiple stakeholders and how alignment is implemented through real-world case studies in claim fraud monitoring and health claim processing. You’ll also gain insights into why the iterative process of developing MLOps architectures is needed to meet performance and compliance requirements.

Talk: Building AI Applications as a Developer

Presenters:
Roy Derks, Technical Product Manager, IBM watsonx.ai | Alex Seymour, Technical Product Manager, IBM watsonx.ai

About the Speaker:
Roy Derks is a lifelong software developer, author and public speaker from the Netherlands. His mission is to make the world a better place through technology by inspiring developers all over the world. Before jumping into Developer Advocacy and joining IBM, he founded and worked at multiple startups.

Talk Track: Virtual Workshop

Talk Technical Level: 5/7

Talk Abstract:
In today’s world, developers are essential for creating exciting AI applications. They build powerful applications and APIs that use Large Language Models (LLMs), relying on open-source frameworks or tools from LLM providers. In this session, you’ll learn how to build your own AI applications using the watsonx and watsonx.ai ecosystem, including use cases such as Retrieval-Augmented Generation (RAG) and Agents. Through live, hands-on demos, we’ll explore the watsonx.ai developer toolkit and the watsonx.ai Flows Engine. Join us to gain practical skills and unlock new possibilities in AI development!

What You’ll Learn:
By attending this session, you’ll acquire essential skills for effectively leveraging Large Language Models (LLMs) in your projects. You’ll learn to use LLMs via APIs and SDKs, integrate them with your own data, and understand Retrieval-Augmented Generation (RAG) concepts while building RAG systems using watsonx.ai. Additionally, this session will cover Agentic workflows, guiding you through their creation with watsonx.ai. Finally, you’ll explore how to work with various LLMs, including Granite, Llama, and Mistral, equipping you with the versatility needed to optimize AI applications in your development work.

Talk: RAG Hyperparameter Optimization: Translating a Traditional ML Design Pattern to RAG Applications

Presenter:
Niels Bantilan, Chief ML Engineer, Union.ai

About the Speaker:
Niels is the Chief Machine Learning Engineer at Union.ai, and core maintainer of Flyte, an open source workflow orchestration tool, author of UnionML, an MLOps framework for machine learning microservices, and creator of Pandera, a statistical typing and data testing tool for scientific data containers. His mission is to help data science and machine learning practitioners be more productive.

Talk Track: Research or Advanced Technical

Talk Technical Level: 4/7

Talk Abstract:
In the era of Foundation LLMs, a lot of energy has moved from the model training stage to the inference stage of the ML lifecycle, as we can see in the explosion of different RAG architectures. But has a lot changed in terms of the techniques to systematically improve performance of models at inference time? In this talk, we’ll recast hyperparameter optimization in terms of improving RAG pipelines against a “golden evaluation dataset” and see that not much has changed at a fundamental level: gridsearch, random search, and bayesian optimization still apply, and we can use these tried and true techniques for any type of inference architecture. All you need is a high quality dataset.

What You’ll Learn:
You’ll learn about hyperparameter optimization (HPO) techniques that are typically used in model training and apply them to the context of RAG applications. This session will highlight the conceptual and practical differences when implementing HPO in the AI inference setting and see how some of the traditional concepts in ML still apply, such as the bias-variance tradeoff.
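As a rough sketch of the idea, grid search translates directly to the RAG setting: enumerate configurations and score each against a golden dataset. The `evaluate` function below is a hypothetical stand-in for running retrieval plus generation and scoring the answers.

```python
import itertools

# Hypothetical stand-in: in a real pipeline, evaluate() would run
# retrieval + generation and score answers against the golden dataset.
def evaluate(chunk_size, top_k, golden_set):
    return len(golden_set) * top_k / chunk_size

# RAG hyperparameters play the role that learning rate or tree depth
# play in classic HPO.
search_space = {"chunk_size": [256, 512, 1024], "top_k": [3, 5, 10]}
golden_set = [("What is X?", "X is ..."), ("Define Y.", "Y is ...")]

best_score, best_cfg = float("-inf"), None
for values in itertools.product(*search_space.values()):
    cfg = dict(zip(search_space, values))
    score = evaluate(golden_set=golden_set, **cfg)
    if score > best_score:
        best_score, best_cfg = score, cfg

print(best_cfg)
```

Random search or Bayesian optimization would simply replace the exhaustive `itertools.product` loop with a different sampling strategy over the same search space.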

Talk: Multi-Graph Multi-Agent Systems – Determinism through Structured Representations

Presenter:
Tom Smoker, Technical Founder, WhyHow.AI

About the Speaker:
Co-Founder @ WhyHow.AI

Talk Track: Applied Case Studies

Talk Technical Level: 4/7

Talk Abstract:
As multi-agent systems are increasingly adopted, the range of unstructured information that agents must process in structured ways, both to return to a user and to pass back into the agent system, will grow. We explore what the increasing trend of multi-graph multi-agent systems for deterministic information representation and retrieval looks like.

What You’ll Learn:
Why structured knowledge representations are important, how structured knowledge representation requirements have changed and will change in an increasingly agentic-driven world with complex multi-agent systems.
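A toy sketch of why structured representations buy determinism (all names and data here are hypothetical): if each agent queries its own named graph of triples, the same query always yields the same answer, unlike free-form LLM generation.

```python
# Each agent reads/writes only its own named graph of
# (subject, predicate, object) triples.
graphs = {
    "legal_agent":   {("contract_42", "status", "signed")},
    "finance_agent": {("contract_42", "value", "10000 USD")},
}

def query(graph_name, subject, predicate):
    """Deterministic: the same query against the same graph always
    returns the same answers, in the same order."""
    return sorted(o for (s, p, o) in graphs[graph_name]
                  if s == subject and p == predicate)

print(query("finance_agent", "contract_42", "value"))
```

Scoping each agent to its own graph also bounds what information it can return to the user or to other agents.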

Talk: Fast Data Loading for Deep Learning Workloads with lakeFS Mount

Presenter:
Amit Kesarwani, Director, Solution Engineering, lakeFS

About the Speaker:
Amit heads the solution architecture group at Treeverse, the company behind lakeFS, an open-source platform that delivers a Git-like experience to object-storage based data lakes.
Amit has 30+ years of experience as a technologist working with Fortune 100 companies as well as start-ups, designing and implementing technical solutions for complicated business problems.
As an entrepreneur, he launched a cloud offering to provide Data Warehouse as a Service. Amit holds a Master’s certificate in Project Management from George Washington University and a bachelor’s degree in Computer Science and Technology from the Indian Institute of Technology (IIT), India. He is the inventor of the patent “System and Method for Managing and Controlling Data”.

Talk Track: Virtual Talk

Talk Technical Level: 6/7

Talk Abstract:
Working with large datasets locally allows for much more control over your executions and workflows, particularly for AI and deep learning workloads.

However, this can present a number of tradeoffs that lakeFS Mount helps solve:
• 𝗚𝗶𝘁 𝗶𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻 – Mounting a path in a Git repo automatically tracks the data version, linking it with your code. When checking older code versions, you get the corresponding data version, preventing local-only successes.

• 𝗦𝗽𝗲𝗲𝗱 – Data consistency and performance are guaranteed. lakeFS prefetches commit metadata into a local cache in sub-milliseconds, allowing you to work immediately without having to wait for large dataset downloads.

• 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝘁 – lakeFS Mount efficiently uses cache, accurately predicting which objects will be accessed. This enables granular pre-fetching for metadata and data files before processing starts.

• 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝘆 – Working locally risks using outdated or incorrect data versions. With Mount, you can work with consistent, immutable versions, ensuring you know exactly what data version you’re using.

What You’ll Learn
With lakeFS Mount, you can transparently mount an object store reference as a local directory (yes, even at petabyte-scale), while avoiding the common pitfalls typically associated with trying to access an object store as a filesystem.

In this talk, you will learn about lakeFS Mount and see a demonstration of:
• Training a TensorFlow predictive model on data mounted using lakeFS Mount
• Integration with Git to version code and data together
• Reproducibility of code as well as data

Talk: HybridRAG: Merging Knowledge Graphs with Vector Retrieval for Efficient Information Extraction

Presenter:
Bhaskarjit Sarmah, Vice President, BlackRock

About the Speaker:
As a Vice President and Data Scientist at BlackRock, I apply my machine learning skills and domain knowledge to build innovative solutions for the world’s largest asset manager. I have over 10 years of experience in data science, spanning multiple industries and domains such as retail, airlines, media, entertainment, and BFSI.

At BlackRock, I am responsible for developing and deploying machine learning algorithms to enhance the liquidity risk analytics framework, identify price-making opportunities in the securities lending market, and create an early warning system using network science to detect regime change in markets. I also leverage my expertise in natural language processing and computer vision to extract insights from unstructured data sources and generate actionable reports. My mission is to use data and technology to empower investors and drive better financial outcomes.

Talk Track: Virtual Talk

Talk Technical Level: 7/7

Talk Abstract:
In this session we will introduce HybridRAG, a novel approach that combines Knowledge Graphs (KGs) and Vector Retrieval Augmented Generation (VectorRAG) to improve information extraction from financial documents. HybridRAG addresses challenges in analyzing financial documents, such as domain-specific language and complex data formats, which traditional RAG methods often struggle with. By integrating Knowledge Graphs, HybridRAG provides a structured representation of financial data, thereby enhancing the accuracy and relevance of the generated answers. Experimental results demonstrate that HybridRAG outperforms both VectorRAG and GraphRAG individually in terms of retrieval accuracy and answer generation.

What You’ll Learn
Key learnings from this session will include an understanding of the integration of Knowledge Graphs (KGs) and Vector Retrieval Augmented Generation (VectorRAG) to enhance information extraction from financial documents. The paper addresses challenges posed by domain-specific language and complex data formats in financial documents, which are often not well-handled by general-purpose language models. The HybridRAG approach demonstrates improved retrieval accuracy and answer generation compared to using VectorRAG or GraphRAG alone, highlighting its effectiveness in generating contextually relevant answers. Although the focus is on financial documents, the techniques discussed have broader applications, offering insights into the wider utility of HybridRAG beyond the financial domain.
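The core retrieval step of such a hybrid approach might be sketched as follows (a minimal illustration with made-up data and scoring, not the paper’s implementation): context from a knowledge-graph lookup is merged with context from vector similarity search before generation.

```python
# Hypothetical data: structured facts in a KG, unstructured chunks with
# toy embedding vectors.
kg = {("AcmeCorp", "ceo"): "J. Doe", ("AcmeCorp", "revenue_2023"): "$1.2B"}
chunks = {"AcmeCorp reported strong growth in 2023.": [0.9, 0.1],
          "Weather was mild this quarter.": [0.1, 0.9]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def hybrid_context(entity, relation, query_vec, top_k=1):
    # Structured half: exact KG lookup for the entity/relation.
    kg_facts = [f"{entity} {relation}: {kg[(entity, relation)]}"] \
        if (entity, relation) in kg else []
    # Unstructured half: vector similarity over the chunk store.
    ranked = sorted(chunks, key=lambda c: cosine(chunks[c], query_vec),
                    reverse=True)
    # The generator would receive both sources of context.
    return kg_facts + ranked[:top_k]

ctx = hybrid_context("AcmeCorp", "ceo", [1.0, 0.0])
print(ctx)
```

The KG contributes precise, structured facts (useful for domain-specific terminology), while the vector side contributes broader narrative context; the generation step then conditions on both.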

Talk: Robustness with Sidecars: Weak-To-Strong Supervision For Making Generative AI Robust For Enterprise

Presenter:
Dan Adamson, Interim Chief Executive Officer & Co-Founder, AutoAlign AI

About the Speaker:
Dan Adamson is a co-founder of AutoAlign, a company focused on AI safety and performance. He has also co-founded PointChain (developing a neo-banking platform using AI for high-risk and underserved industries) and Armilla AI (a company helping enterprises manage AI risk with risk transfer solutions). He previously founded OutsideIQ, deploying AI-based AML and anti-fraud solutions to over 100 global financial institutions. He also previously served as the Chief Architect at Medstory, a vertical search start-up acquired by Microsoft. Adamson holds several search algorithm and AI patents in addition to numerous academic awards and holding an M.Sc. from U.C. Berkeley and B.Sc. from McGill. He also serves on the McGill Faculty of Science Advisory Board.

Talk Track: Business Strategy or Ethics

Talk Technical Level: 2/7

Talk Abstract:
Many enterprise pilots with GenAI are stalling because of a lack of consistent performance as well as compliance, safety and security concerns. Comprehensive GenAI safety must continually evolve to mitigate critical issues such as hallucinations, jailbreaks, data leakage, biased content, and more.

Learn how AutoAlign CEO and co-founder Dan Adamson leveraged over two decades of experience building regulated AI solutions to launch Sidecar, ensuring models are powerful AND safe. Learn how weak-to-strong controls put decisions directly in users’ hands, improving model power while ensuring Generative AI is safe to use.

What You’ll Learn:
During this session, participants will have the opportunity to learn about common approaches to protect GenAI against jailbreaks, bias, data leakage and hallucinations and other harms. We’ll discuss the unique requirements of bringing LLMs to production in real-world applications, the critical importance of ensuring a high level of robustness and safety, and tools for solving these problems.

We’ll then discuss a new approach: weak supervision with a sidecar that can not only increase safety but can also make models more powerful. Finally, we’ll show some of our latest benchmarks around accuracy and discuss these state-of-the-art results.

Talk: Revolutionizing the Skies: An MLOps Case Study of LATAM Airlines

Presenters:
Michael Haacke Concha, MLOps Lead, LATAM Airlines
Diego Castillo Warnken, Staff Machine Learning Engineer, LATAM Airlines

About the Speaker:
Michael Haacke Concha is the Lead Machine Learning Engineer of the centralized MLOps team at LATAM Airlines. He holds both a Bachelor’s and a Master’s degree in Theoretical Physics from Pontificia Universidad Católica de Chile (PUC). Over his three years at LATAM Airlines, he developed an archival and retrieval system for black box data of the aircraft to support analytics. He then played a key role in building the framework for integrating the Iguazio MLOps platform within the company. In the past year, he has been leading the development of a new platform using Vertex GCP.

Prior to joining LATAM Airlines, Michael worked as a data scientist on the ATLAS experiment at the Large Hadron Collider (LHC), where he contributed to various studies, including the search for a long-lived Dark Photon and a Heavy Higgs.

Diego Castillo is a Consultant Machine Learning Engineer at Neuralworks, currently on assignment as Staff in LATAM Airlines, where he plays a pivotal role within the decentralized Data & AI Operations team. A graduate of the University of Chile with a degree in Electrical Engineering, Diego has excelled in cross-functional roles, driving the seamless integration of machine learning models into large-scale production environments. As a Staff Machine Learning Engineer at LATAM, he not only leads and mentors other MLEs but also shapes the technical direction across key business areas.

Throughout his career at LATAM Airlines, Diego has significantly impacted diverse domains, including Cargo, Customer Care, and the App and Landing Page teams. More recently, he has been supporting the migration of the internal MLOps framework from Iguazio to Vertex GCP.

With a comprehensive expertise spanning the entire machine learning lifecycle, Diego brings a wealth of experience from previous roles, including Data Scientist, Backend Developer, and Data Engineer, making him a versatile leader in the AI space.

Talk Track: Applied Case Studies

Talk Technical Level: 2/7

Talk Abstract:
This talk explores how LATAM Airlines leveraged MLOps to revolutionize their operations and achieve financial gains in the hundreds of millions of dollars. By integrating machine learning models into their daily workflows and automating the deployment and management processes, LATAM Airlines was able to optimize tariffs, enhance customer experiences, and streamline maintenance operations. The talk will highlight key MLOps strategies employed, such as continuous integration and delivery of ML models and real-time data processing. Attendees will gain insights into the tangible benefits of MLOps, including cost savings, operational efficiencies, and revenue growth, showcasing how strategic ML operations can create substantial value in the airline industry.

What You’ll Learn
You will gain insight into how a scalable and decentralized tech team grows inside LATAM Airlines, thanks to technology and organizational structure. You will also learn about some of the successful use cases of our MLOps ecosystem.

Talk: LeRobot: Democratizing Robotics

Presenter:
Remi Cadene, ML for Robotics, Hugging Face

About the Speaker:
I build next-gen robots at Hugging Face. Before, I was a research scientist at Tesla on Autopilot and Optimus. Academically, I did some postdoctoral studies at Brown University and my PhD at Sorbonne.

My scientific interest lies in understanding the underlying mechanisms of intelligence. My research is focused on learning human behaviors with neural networks. I am working on novel architectures, learning approaches, theoretical frameworks and explainability methods. I like to contribute to open-source projects and to read about neuroscience!

Talk Track: Virtual Talk

Talk Technical Level: 3/7

Talk Abstract:
Learn about how LeRobot aims to lower the barrier of entry to robotics, and how you can get started!

What You’ll Learn
1. What LeRobot’s mission is.
2. Ways in which LeRobot aims to lower the barrier of entry to robotics.
3. How you can get started with your own robot.
4. How you can get involved in LeRobot’s development.

Talk: From ML Repository to ML Production Pipeline

Presenters:
Jakub Witkowski, IT Expert, Roche Informatics | Dariusz Adamczyk, IT Expert, Roche Informatics

About the Speaker:
Jakub Witkowski, PhD is a data scientist and MLOps engineer with experience spanning various industries, including consulting, media, and pharmaceuticals. At Roche, he focuses on understanding the needs of data scientists to help them make their work and models production-ready. He achieves this by providing comprehensive frameworks and upskilling opportunities.

Dariusz is a DevOps and MLOps engineer. He has experience in various industries such as public cloud computing, telecommunications, and pharmaceuticals. At Roche, he focuses on infrastructure and the process of deploying machine learning models into production.

Talk Track: Virtual Talk

Talk Technical Level: 4/7

Talk Abstract:
In the pRED MLOps team, we collaborate closely with research scientists to transition their machine learning models into a production environment seamlessly. Through our efforts, we have developed a robust framework that standardises and scales this process effectively. In this talk, we will provide an in-depth look at our framework, the tools we leverage, and the challenges we overcome in this journey.

What You’ll Learn
– How to create a framework for moving ML code to production
– What can be automated in this process (the role of containerisation, CI/CD, and building reusable components for repeated tasks)
– Which tools are important for the dev team
– The most important challenges to tackle in this process

Talk: Striking the Balance: Leveraging Human Intelligence with LLMs for Cost-Effective Annotations

Presenter:
Geoff LaPorte, Applied AI Solutions Architect, Appen

About the Speaker:
Geoff is a seasoned tech innovator with over 13 years of experience, transitioning from management consulting to software development. He specializes in bridging the gap between technology and business strategy, consistently delivering user-focused, high-impact solutions. Geoff is known for pushing boundaries and tackling complex technology challenges with a passion.

Talk Track: Applied Case Studies

Talk Technical Level: 7/7

Talk Abstract:
Data annotation involves assigning relevant information to raw data to enhance machine learning (ML) model performance. While this process is crucial, it can be time-consuming and expensive. The emergence of Large Language Models (LLMs) offers a unique opportunity to automate data annotation. However, the complexity of data annotation, stemming from unclear task instructions and subjective human judgment on equivocal data points, presents challenges that are not immediately apparent.

In this session, Geoff will provide an overview of an experiment that Appen recently conducted to test the tradeoff between the quality and cost of training ML models via LLMs vs. human input. The goal was to differentiate between utterances that could be confidently annotated by LLMs and those that required human intervention. This differentiation was crucial to ensure a diverse range of opinions and to prevent incorrect responses from overly general models. Geoff will walk audience members through the dataset and methodology used for the experiment, as well as the company’s research findings.

What You’ll Learn
Geoff will walk audience members through an experiment that highlights a key issue with using a vanilla LLM—it might struggle with complex real-world tasks. Researchers recommend exercising caution when relying solely on LLMs for annotation. Instead, a balanced approach combining human input with LLM capabilities is recommended, considering their complementary strengths in terms of annotation quality and cost-efficiency.

Talk: ML Deployment at Faire: Predicting the Future, Serving the Present

Presenter:
Harshit Agarwal, Senior Machine Learning Engineer, Faire Wholesale Inc

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
How Faire transitioned a traditional infrastructure into a modern, flexible model deployment and serving stack that supports a range of model types, while ensuring operational excellence and scalability in a dynamic e-commerce environment.

Over the past few years at Faire, we have overhauled our ML serving infrastructure, moving from hosting XGBoost models in a monolithic service to a flexible and powerful ML deployment and serving stack that powers all types of models, small and big.

In this talk, we’ll cover how we set up a system that makes it easy to migrate, deploy, scale, and manage different types of models. Key points will include how we set up infrastructure as code and CI/CD pipelines for smooth deployment, automated testing, and created user-friendly tools for managing model releases. We’ll also touch on how we built in observability and monitoring to keep an eye on model performance and reliability.

Come and learn how Faire’s ML serving stack helps our team quickly bring new ideas to life, while also maintaining the operational stability needed for a growing marketplace.

What You’ll Learn
1. How to best structure ML serving and deployment infrastructure
2. How to build testing and observability into your deployment and serving infra
3. How to build production grade tools that your data scientists and MLEs will love
4. See how we are serving users at scale and the design choices that we made

Talk: Memory Optimizations for Machine Learning

Presenter:
Tejas Chopra, Senior Software Engineer, Netflix

About the Speaker:
Tejas Chopra is a Senior Software Engineer, working in the Data Storage Platform team at Netflix, where he is responsible for architecting storage solutions to support Netflix Studios and Netflix Streaming Platform. Prior to Netflix, Tejas was working on designing and implementing the storage infrastructure at Box, Inc. to support a cloud content management platform that scales to petabytes of storage & millions of users. Tejas has worked on distributed file systems & backend architectures, both in on-premise and cloud environments as part of several startups in his career. Tejas is an International Keynote Speaker and periodically conducts seminars on Micro services, NFTs, Software Development & Cloud Computing and has a Masters Degree in Electrical & Computer Engineering from Carnegie Mellon University, with a specialization in Computer Systems.

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
As Machine Learning continues to forge its way into diverse industries and applications, optimizing computational resources, particularly memory, has become a critical aspect of effective model deployment. This session, “Memory Optimizations for Machine Learning,” aims to offer an exhaustive look into the specific memory requirements in Machine Learning tasks, including Large Language Models (LLMs), and the cutting-edge strategies to minimize memory consumption efficiently.

We’ll begin by demystifying the memory footprint of typical Machine Learning data structures and algorithms, elucidating the nuances of memory allocation and deallocation during model training phases. The talk will then focus on memory-saving techniques such as data quantization, model pruning, and efficient mini-batch selection. These techniques offer the advantage of conserving memory resources without significant degradation in model performance.

A special emphasis will be placed on the memory footprint of LLMs during inferencing. LLMs, known for their immense size and complexity, pose unique challenges in terms of memory consumption during deployment. We will explore the factors contributing to the memory footprint of LLMs, such as model architecture, input sequence length, and vocabulary size. Additionally, we will discuss practical strategies to optimize memory usage during LLM inferencing, including techniques like model distillation, dynamic memory allocation, and efficient caching mechanisms.

By the end of this session, attendees will have a comprehensive understanding of memory optimization techniques for Machine Learning, with a particular focus on the challenges and solutions related to LLM inferencing.

What You’ll Learn
By the end of this session, attendees will have a comprehensive understanding of memory optimization techniques for Machine Learning, including pruning, quantization, and distillation, and where to apply them. They will also learn how to implement these techniques using PyTorch.
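As a taste of one technique the session covers, here is a minimal, dependency-free sketch of int8 affine quantization (illustrative only; in practice you would use PyTorch’s quantization APIs): float32 weights map to int8 values plus a scale and zero point, cutting storage roughly 4x.

```python
def quantize(weights):
    """Map float weights to int8 plus (scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # guard against all-equal weights
    zero_point = round(-lo / scale) - 128   # shift range into [-128, 127]
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

w = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, err)  # reconstruction error stays within half a quantization step
```

Each int8 value occupies 1 byte versus 4 bytes for float32; the price is a small, bounded reconstruction error per weight.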

Talk: From Black Box to Glass Box: Interpreting your Model

Presenter:
Zachary Carrico, Senior Machine Learning Engineer, Apella

About the Speaker:
Zac is a Senior Machine Learning Engineer at Apella, specializing in machine learning products for improving surgical operations. He has a deep interest in healthcare applications of machine learning, and has worked on cancer and Alzheimer’s disease diagnostics. He has end-to-end experience developing ML systems: from early research to serving thousands of daily customers. Zac is an active member of the ML community, having presented at conferences such as Ray Summit, TWIMLCon, and Data Day. He has also published eight journal articles. He is passionate about advancing model interpretability and reducing model bias. In addition, he has extensive experience in improving MLOps to streamline the deployment and monitoring of models, reducing complexity and time. Outside of work, Zac enjoys spending time with his family in Austin and traveling the world in search of the best surfing spots.

Talk Track: Research or Advanced Technical

Talk Technical Level: 5/7

Talk Abstract:
Interpretability is crucial for improving model performance, reducing biases, and ensuring compliance with AI safety and fairness regulations. In this session, complex neural networks will be transformed from opaque “black boxes” into interpretable “glass boxes” by exploring a wide range of neural network-specific interpretability techniques. Attendees will learn about methods such as saliency maps, integrated gradients, Grad-CAM, SHAP, and activation maximization. The session will combine theoretical explanations with practical demonstrations, helping attendees effectively improve transparency and trust in neural network predictions.

What You’ll Learn
Attendees will learn how to apply various neural network interpretability techniques to understand model behavior better. They will gain insights into methods such as saliency maps, Grad-CAM for visualizing important regions in images, and integrated gradients for attributing feature importance. The session will also cover feature visualization methods to understand neuron activations and how to use layer-wise relevance propagation to track the impact of inputs through network layers. By the end of the session, participants will know how to use these tools to make neural networks more understandable and how to communicate the insights to diverse stakeholders.
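To make the gradient-based methods concrete, here is a minimal, library-free sketch of a saliency computation. A hypothetical scoring function stands in for a trained network, and finite differences stand in for autograd gradients (which is what PyTorch-based saliency tools actually use); the idea is the same: the magnitude of the score’s sensitivity to each input feature is that feature’s importance.

```python
def model(x):
    # Hypothetical scoring function standing in for a trained network.
    return 3.0 * x[0] + 0.5 * x[1] ** 2 - 1.0 * x[2]

def saliency(f, x, eps=1e-5):
    """Estimate |df/dx_i| for each feature via finite differences."""
    grads = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        grads.append((f(bumped) - f(x)) / eps)
    return [abs(g) for g in grads]  # magnitude = feature importance

s = saliency(model, [1.0, 2.0, 3.0])
print([round(v, 2) for v in s])  # feature 0 matters most at this input
```

For images, the same per-feature sensitivities arranged back into the pixel grid are exactly what a saliency map visualizes.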

Talk: Secure and Scalable MLOps Pipelines for Generative AI in Cloud-Native Architectures

Presenter:
Sumit Dahiya, Solution Architect (Vice President), Barclays Americas

About the Speaker:
Sumit Dahiya is an accomplished Solution Architect and Vice President at Barclays Americas, with over 18 years of experience in technology, cybersecurity, and digital transformation. He specializes in Identity and Access Management (IAM), cloud-native architectures, microservices, and the integration of AI/ML technologies into enterprise systems. Sumit has successfully led numerous large-scale projects across various industries, including finance, telecom, and retail.

Throughout his career, Sumit has been instrumental in designing scalable, secure infrastructures, and pioneering innovative solutions for some of the most complex technological challenges. His contributions include spearheading the deployment of Microsoft VIVA at Barclays, leading the Request/Breakglass System migration, and building cloud-based IAM frameworks that ensure enhanced security, compliance, and operational efficiency.

Sumit has been recognized with numerous awards, including the Global Recognition Award and Influencer of the Year by the Asian African Economic Forum. He is also a Top Voice on LinkedIn for System Architecture and an active mentor on platforms like ADP List and Startupbootcamp, where he helps guide future leaders in technology.

A published author and researcher, Sumit has contributed to peer-reviewed journals and conferences on topics such as cloud security, machine learning, and enterprise architecture. His recent research papers include works on AI-powered cloud solutions and Identity and Access Management integration.

In addition to his technical expertise, Sumit is a regular speaker at international conferences, sharing his insights on overcoming the challenges of Generative AI adoption, digital transformation, and scalable enterprise architectures. He has also served as a judge for prestigious awards like the Brandon Hall Excellence Awards and the Globee Leadership Awards.

Talk Track: Virtual Talk

Talk Technical Level: 5/7

Talk Abstract:
Generative AI has rapidly evolved, offering immense potential to revolutionize industries by enabling intelligent automation, creative content generation, and personalized experiences. However, deploying and managing Generative AI models at scale comes with unique challenges, particularly around security, scalability, and integration in cloud-native environments.

In this talk, we will explore how to build secure and scalable MLOps pipelines tailored for Generative AI models, addressing critical challenges in model deployment, version control, data governance, and lifecycle management. The session will focus on designing robust pipelines that can effectively handle the resource-intensive nature of Generative AI while ensuring security and compliance.

What You’ll Learn:
Architecting Scalable MLOps Pipelines for Generative AI:

How to design modular and microservices-based MLOps pipelines that scale to meet the computational demands of Generative AI models.
Techniques to optimize cloud-native infrastructure (AWS, Azure, Google Cloud) using containerization (Docker, Kubernetes) and orchestration tools for efficient resource management.
Securing the MLOps Pipeline for Generative AI:

Best practices for securing every stage of the Generative AI lifecycle, from data ingestion and model training to deployment and monitoring.
How to implement Identity and Access Management (IAM) to control and audit access to models, sensitive data, and outputs, ensuring the security of the AI workflow.
Implementing Compliance and Governance in AI Workflows:

Strategies to ensure compliance with industry regulations like GDPR, HIPAA, and SOX when deploying Generative AI models.
How to integrate governance frameworks into your MLOps pipeline to enhance transparency, bias mitigation, and ethical AI practices.
Building Continuous Integration/Continuous Deployment (CI/CD) for Generative AI:

Techniques to automate the model training and deployment process using CI/CD pipelines, ensuring that your AI systems are continuously updated and improved.
How to monitor models in production for real-time performance, detect model drift, and ensure ongoing security and compliance.
Real-World Case Studies and Practical Insights:

Learn from case studies of successful Generative AI deployments in cloud-native environments, showcasing best practices for secure, scalable, and reliable MLOps pipelines.
Practical insights into overcoming the common challenges faced when operationalizing Generative AI, including cost management, latency, and maintaining high availability.

Talk: LLMs in Vision Models

Presenter:
Arpita Vats, Senior AI Engineer

About the Speaker:
I am a Senior AI Engineer at LinkedIn with expertise in AI, Deep Learning, NLP, and Computer Vision. I have experience from Meta and Amazon, where I focused on LLM and Generative AI. I have published papers and led projects enhancing recommendation algorithms and multimedia models for various industry applications.

Talk Track: Virtual Talk

Talk Technical Level: 5/7

Talk Abstract:
The integration of Large Language Models (LLMs) in vision-based AI systems has sparked a new frontier in multimedia understanding. Traditional vision models, while powerful, often lack the ability to comprehend contextual information beyond visual features. By incorporating LLMs, vision models can process both visual and textual information, creating a more holistic and interpretable understanding of multimedia content. This presentation will explore the convergence of LLMs with vision models, highlighting their application in image captioning, object recognition, and multimodal recommendation systems.

What You’ll Learn:
By attending this presentation, the audience will learn how Large Language Models (LLMs) can enhance the capabilities of vision-based AI systems, creating more context-aware and interpretable multimedia models. Attendees will gain insights into the architecture and integration techniques used to combine vision and language models, practical industry applications, and the challenges and solutions associated with building these advanced systems. They will leave with a deeper understanding of how LLMs in vision models are transforming multimedia analysis, enabling more accurate, scalable, and personalized AI-driven solutions.

Panel: Achieving Long-Term AI Growth: Real Problem Solving vs. Trend-Based Solutions

Presenters:
Carly Taylor, Founder, Rebel Data Science | Joe Reis, Recovering Data Scientist, Best-Selling Author, Ternary Data, LLC | Jepson Taylor, Former Chief AI Strategist, VEOX Inc | Hugo Bowne-Anderson, Independent Data and AI Scientist, hey.com

About the Speaker:
Carly is a data scientist, computational chemist and machine learning engineer. She obtained her M.S. in chemistry from the University of Colorado, focusing on computational quantum dynamics. She has authored multiple peer-reviewed publications and holds two non-provisional machine learning patents. When she isn’t writing about herself in the third person, building mechanical keyboards or neglecting the Oxford comma, she works as a director of security strategy for Call of Duty at Activision Publishing.

Joe Reis is a “recovering data scientist” who’s worked in a variety of data roles (analytics, machine learning, engineering, architecture, etc.) since the early 2000s. Joe is the co-author of The Fundamentals of Data Engineering (O’Reilly, 2022).

Jepson is a popular speaker in the AI space, having been invited to give AI talks to companies like SpaceX, Red Bull, Goldman Sachs, Amazon, and various branches of the US government. Jepson’s applied career has covered semiconductor, quant finance, HR analytics, deep-learning startup, and AI platform companies. Jepson co-founded and sold his deep-learning company Zeff.ai to DataRobot in 2020 and later joined Dataiku as their Chief AI Strategist. Jepson is currently launching a new AI company focused on the next generation of AI called VEOX Inc.

Talk Track: Panel

Talk Technical Level: TBD

Talk Abstract:
TBD

What You’ll Learn:
TBD

Panel: The Current Investment Landscape: Opportunities & Challenges in ML/Gen AI

Presenters:
George Mathew, Managing Director, Insight Partners | Prerna Sharma, General Partner, Antler | Mark Weber, Fellow / Investor, MIT Media Lab/Tectonic Ventures

About the Speaker:
George Mathew is a Managing Director at Insight Partners, where he focuses on venture-stage investments in AI, ML, Analytics, and Data companies as they are establishing product/market fit.

He brings 20+ years of experience developing high-growth technology startups including most recently being CEO of Kespry. Prior to Kespry, George was President & COO of Alteryx where he scaled the company through its IPO (AYX). Previously he held senior leadership positions at SAP and salesforce.com. He has driven company strategy, led product management and development, and built sales and marketing teams.

George holds a Bachelor of Science in Neurobiology from Cornell University and a Masters in Business Administration from Duke University, where he was a Fuqua Scholar.

Prerna Sharma is a General Partner at Antler.

Talk Track: Business Strategy

Talk Technical Level: 1/7

Talk Abstract:
TBD

What You’ll Learn:
TBD

Panel: Toyota's Generative AI Journey

Presenters:
Ravi Chandu Ummadisetti, Generative AI Architect, Toyota | Stephen Ellis, Technical Generative AI Product Manager, Toyota | Kordel France, AI Architect, Toyota | Eric Swei, AI Architect, Toyota

About the Speaker:
Ravi Chandu Bio (Generative AI Architect): Ravi Chandu Ummadisetti is a distinguished Generative AI Architect with over a decade of experience, known for his pivotal role in advancing AI initiatives at Toyota Motor North America. His expertise in AI/ML methodologies has driven significant improvements across Toyota’s operations, including a 75% reduction in production downtime and the development of secure, AI-powered applications. Ravi’s work at Toyota, spanning manufacturing optimization, legal automation, and corporate AI solutions, showcases his ability to deliver impactful, data-driven strategies that enhance efficiency and drive innovation. His technical proficiency and leadership have earned him recognition as a key contributor to Toyota’s AI success.

Stephen Ellis Bio (Technical Generative AI Product Manager): 10 years of experience in research strategy and the application of emerging technologies for companies ranging from startups to Fortune 50 enterprises. Former Director of the North Texas Blockchain Alliance, where he led the cultivation of blockchain and cryptocurrency competencies among software developers, C-level executives, and private investment advisors. Formerly the CTO of Plymouth Artificial Intelligence, which researched and developed future applications of AI; in that capacity he advised companies on building platforms that leverage emerging technologies for new business cases. Currently a Technical Product Manager at Toyota Motor North America, focused on enabling generative AI solutions for various groups across the enterprise to drive transformation in developing new mobility solutions and enterprise operations.

Kordel France Bio (AI Architect): Kordel brings a diverse background of experiences in robotics and AI from both academia and industry. He has multiple patents in advanced sensor design and spent much of the past few years founding and building a successful sensor startup that enables the sense of smell for robotics. He is on the board of multiple startups and continues to further his AI knowledge as an AI Architect at Toyota.

Eric Swei Bio (Senior Generative AI Architect): Boasting an impressive career spanning over two decades, Eric Swei is an accomplished polymath in the tech arena, with deep-seated expertise as a full stack developer, system architect, integration architect, and specialist in computer vision, alongside his profound knowledge in generative AI, data science, IoT, and cognitive technologies.

At the forefront as the Generative AI Architect at Toyota, Eric leads a formidable team in harnessing the power of generative AI. Their innovative endeavors are not only enhancing Toyota’s technological prowess but also redefining the future of automotive solutions with cutting-edge AI integration.

Talk Track: Case Study

Talk Technical Level: 2/7

Talk Abstract:
Team Toyota will delve into their innovative journey with generative AI in automotive design. The talk explores how Toyota’s research integrates traditional engineering constraints with state-of-the-art generative AI techniques, enhancing designers’ capabilities while ensuring safety and performance considerations.

What You’ll Learn:
1. Toyota’s Innovation Legacy
2. Leveraging LLMs in Automotive – battery, vehicle, manufacturing, etc.
3. Failures in Generative AI projects
4. Education to business stakeholders

Talk: Building AI Infrastructure for GenAI Wave

Presenter:
Shreya Rajpal, CEO & Co-Founder, Guardrails AI

About the Speaker:
Shreya Rajpal is the CEO of Guardrails AI, an open source platform developed to ensure increased safety, reliability and robustness of large language models in real-world applications. Her expertise spans a decade in the field of machine learning and AI. Most recently, she was the founding engineer at Predibase, where she led the ML infrastructure team. In earlier roles, she was part of the cross-functional ML team within Apple’s Special Projects Group and developed computer vision models for autonomous driving perception systems at Drive.ai.

Talk Abstract:
As Generative AI (GenAI) continues to revolutionize industries, it brings a new set of risks and challenges. This talk focuses on building robust AI infrastructure to manage and mitigate these risks. We will explore the multifaceted nature of GenAI risks and the essential infrastructure components to address them effectively. Key topics include implementing real-time monitoring systems to identify anomalies and biases, designing audit trails for enhanced transparency, and developing adaptive security measures to combat emerging threats.

The presentation will also cover governance strategies for GenAI, and the integration of ethical AI frameworks to support responsible development and deployment. This talk is tailored for CISOs, AI ethics officers, ML engineers, and IT architects aiming to build secure and responsible GenAI systems.

Talk: Supercharge ML Teams: ZenML's Real World Impact in the MLOps Jungle

Presenter:
Adam Probst, CEO & Co-Founder, ZenML

About the Speaker:
Adam Probst is the Co-founder and CEO of ZenML, an open-source MLOps framework simplifying machine learning pipelines. He holds a degree in Mechanical Engineering and studied at both Stanford University and the Technical University of Munich. Before co-founding ZenML, Adam gained valuable experience in the ML startup world within the commercial vehicle industry. Driven by a passion for customer-centric solutions, Adam is obsessed with unlocking the tangible benefits of MLOps for businesses.

Talk Abstract:
In the complex ecosystem of machine learning operations, teams often find themselves entangled in a dense jungle of tools, workflows, and infrastructure challenges. This talk explores how ZenML, an open-source MLOps framework, is cutting through the underbrush to create clear paths for ML teams to thrive.
We’ll dive into real-world case studies demonstrating how ZenML has empowered organizations to streamline their ML pipelines, from experimentation to production. Attendees will learn how ZenML addresses common pain points such as reproducibility, scalability, and collaboration, enabling teams to focus on innovation rather than operational overhead.
Key topics include:

Navigating the MLOps tooling landscape with ZenML as your compass
Achieving seamless transitions from laptop to cloud deployments
Enhancing team productivity through standardized, yet flexible, ML workflows
Lessons learned from implementing ZenML in diverse industry settings

Whether you’re a data scientist, ML engineer, or team lead, you’ll gain practical insights on how to leverage ZenML to supercharge your ML initiatives and conquer the MLOps jungle.

Talk: Scale Expert Review by 10x to Ship AI Apps at Lightning Speed

Presenter:
Niklas Nielsen, CTO, Co-Founder, Log10

About the Speaker:
Niklas Nielsen is CTO & Co-founder of Log10.io, a platform that rapidly measures and improves the accuracy of LLM applications by scaling your subject matter experts. Nik previously was Head of Product at MosaicML (acq. Databricks). Prior to that he worked at Intel and Mesosphere on building Distributed Systems deployed at large scale at companies such as Twitter and Apple, and at Adobe on the Virtual Machines and Compilers team. He co-founded CustomerDB, a startup applying AI to product management.

Talk Abstract:
The time and expense of subject matter expert (SME) review is a major barrier to developing generative AI applications, especially for high-risk use cases such as healthcare, finance, insurance, and more. Log10 scales SME review by 10x or more to accelerate deployment to production.

Our AutoFeedback system customizes domain-specific evaluation models that review LLM completions in real time with near-human accuracy, leveraging proprietary Latent Space Readout technology that needs 90% less data than fine-tuned evaluation model approaches. With as few as 20 SME-labeled examples, dev teams can rapidly assess and enhance the accuracy of their generative AI app.

We’ll demo AutoFeedback in a summarization use case, generating scores that assess the quality of CNN news summaries in real time. We’ll show that Latent Space Readout delivers superior accuracy to LLM-as-a-judge, and is cheaper and faster to use than fine tuning an evaluation model with comparable accuracy.

Talk: Finding the Hidden Drivers of AI Business Value

Presenter:
Jakob Frick, CEO & Co-Founder, Radiant

About the Speaker:
Jakob Frick is the CEO and Co-founder of Radiant AI. Before that he worked at Palantir Technologies across a range of areas, from Covid vaccine distribution work with the NHS, to national-scale cyber defense, to model integration across platforms. Before that he worked on open source software at JP Morgan Chase.

Talk Abstract:
How do you know how well your AI products are actually working? In this talk we will explore how companies are looking beyond evaluations to tie LLM activity to their business outcomes. We’ll look at case studies and examples from the field, as well as a framework for identifying the metrics that really move the needle in creating value with Generative AI.

Talk: Effective Workflows for Delivering Production-Ready LLM Apps

Presenter:
Ariel Kleiner, CEO & Co-Founder, Inductor

About the Speaker:
Ariel Kleiner is the CEO and founder of Inductor, which enables teams to deliver production-ready LLM applications significantly faster, more easily, and more reliably. Ariel was previously at Google AI, cofounded Idiomatic, and holds a PhD in computer science (specifically, in machine learning) from UC Berkeley.

Talk Abstract:
Going from an idea to an LLM application that is actually production-ready (i.e., high-quality, trustworthy, cost-effective) is difficult and time-consuming. In particular, LLM applications require iterative development driven by experimentation and evaluation, as well as navigating a large design space (with respect to model selection, prompting, retrieval augmentation, fine-tuning, and more). The only way to build a high-quality LLM application is to iterate and experiment your way to success, powered by data and rigorous evaluation; it is essential to then also observe and understand live usage to detect issues and fuel further improvement. In this talk, we cover the prototype-evaluate-improve-observe workflow that we’ve found to work well, and actionable insights as to how to apply this workflow in practice.

Talk: Optimizing LLM Apps Through Usage: Implicit Feedback, Given Explicitly

Presenter:
Chinar Movsisyan, CEO, Feedback Intelligence

About the Speaker:
Chinar Movsisyan is the founder and CEO of Feedback Intelligence, an MLOps company based in San Francisco that enables enterprises to make sure that LLM-based products are reliable and that the output is aligned with end-user expectations. With over eight years of experience in deep learning, spanning from research labs to venture-backed startups, Chinar has led AI projects in mission-critical applications such as healthcare, drones, and satellites.

Talk Abstract:
In the rapidly evolving landscape of LLM-powered applications, AI teams face a unique challenge: gathering actionable insights from real-world usage to continuously improve app performance. This presentation will explore the common obstacles teams encounter when attempting to optimize their applications based on user interactions. We will dive into how implicit feedback—often hidden within everyday user behavior—can be harnessed effectively to drive measurable improvements. By sharing our approach to extracting and leveraging this data, we’ll demonstrate how it accelerates the development of smarter, more responsive LLM applications.

Talk: Multimodal Agents You Can Deploy Anywhere

Presenter:
David Cheng, Engineering Lead, Reka AI

About the Speaker:
David Cheng is an Engineering Lead at Reka, an AI Research and Product company building multimodal artificial intelligence to empower organisations and businesses. He holds a degree in Computer Science from Caltech. Before Reka, David worked at Google Cloud, leading teams around distributed databases and applied ML.

Talk Abstract:
Reka develops multimodal AI that can be deployed in the cloud, on premises, or on devices. Our frontier models are trained from scratch in an end-to-end fashion to understand text, images, video, and audio. They address the needs of both enterprises and consumers for building powerful applications such as video analysis, speech-to-speech translation, and multimodal document understanding. Join us as we share about how you can use Reka models and our agentic framework to build agents that can see, hear, and speak.

Talk: How to Build Your Own LLM User Feedback Loop with Nebuly

Presenter:
Zunair Waseem, Founding GTM, Nebuly AI

About the Speaker:
Zunair Waseem is a graduate of the University of North Texas with a background in information systems. He originally began his career in the non-profit sector, doing humanitarian work overseas. After being in the space for nearly half a decade, Zunair became part of the founding team at Nebuly. He enjoys spending time outdoors, reading, being with family, and is committed to continuous learning. He also speaks three languages.

Talk Abstract:
User feedback is key to turning any good product into a great one, and LLM-powered products are no exception. However, less than 1% of LLM users provide explicit feedback (thumbs up/down), making it difficult to improve LLM responses and enhance the user experience. Learn how Nebuly helps companies with LLMs in production to build their own LLM User Feedback Loop.

Talk: Overcoming Challenges in Deploying Successful MLOps Solutions

Presenter:
Aaron Cheng, VP of Data Science, dotData

About the Speaker:
Aaron is currently the Vice President of Data Science and Solutions at dotData. As a data science practitioner with 14 years of research and industrial experience, he has held various leadership positions in spearheading new product development in the fields of data science and business intelligence. At dotData, Aaron leads the data science team in working directly with clients and solving their most challenging problems.

Prior to joining dotData, he was a Data Science Principal Manager with Accenture Digital, responsible for architecting data science solutions and delivering business value for the tech industry on the West Coast. He was instrumental in the strategic expansion of Accenture Digital’s footprint in the data science market in North America.

Aaron received his Ph.D. degree in Applied Physics from Northwestern University.

Talk Abstract:
Building effective MLOps systems requires overcoming challenges like complex data pipelines, scaling, and maintaining model accuracy. This presentation will highlight key strategies to ensure models continuously adapt and perform well in changing data environments.

Key Takeaways:

How to automate feature engineering to speed up model development.

Why it’s critical to implement feature lineage tracking for increased transparency.

Deploying drift detection systems to monitor and safeguard model performance.

Early feature drift detection, combined with auto-retraining, prevents model degradation.

Real-world success using automated monitoring and maintenance in MLOps.
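
As a rough illustration of the drift-detection idea in the takeaways above (not dotData’s implementation; the test choice and threshold are assumptions), a two-sample Kolmogorov–Smirnov check on a single feature can gate auto-retraining:

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(reference, live, alpha=0.05):
    """Flag drift when a two-sample KS test rejects 'same distribution'."""
    _stat, p_value = ks_2samp(reference, live)
    return bool(p_value < alpha)

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)  # feature values seen at training time
shifted = rng.normal(0.8, 1.0, 5000)    # live values after an upstream change

if feature_drifted(reference, shifted):
    print("drift detected: trigger retraining pipeline")
```

In practice a check like this would run per feature on a schedule, with the alert wired to the retraining job rather than a print statement.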

Talk: How to Avoid the Common Pitfalls when Scaling Your MLOps

Presenter:
Eero Laaksonen, CEO & Founder, Valohai

About the Speaker:
Serial entrepreneur hippie with a keen interest in making the world a better place. I believe people should work less and enjoy life more. Currently escalating the adoption of machine learning in enterprises around the world with Valohai.

Talk Abstract:
After speaking to thousands of companies over the years, we identified the exact reasons why many in-house ML teams struggle to scale efficiently.

Without spoiling the talk, this happens because of how most ML teams, their tech stacks, and processes tend to evolve over time.

The good news is that it’s never too late to change the way of thinking about MLOps and, as a result, unlock the path to scaling ML teams most efficiently.

What You’ll Learn:
– How ML teams and stacks evolve over time
– The common pitfalls when scaling MLOps
– The new way of thinking about MLOps
– How to scale your ML team efficiently

Talk: Building Hyper-Personalized LLM Applications with Rich Contextual Data

Presenter:
Sergio Ferragut, Principal Developer Advocate, Tecton

About the Speaker:
Sergio is the Principal Developer Advocate at Tecton where he partners closely with the Product and Engineering teams to ensure a seamless developer experience. He has extensive experience in Data Analytics, Data Engineering, Databases, Data Warehousing, Business Intelligence, Architecture, and Big Data.

Talk Abstract:
In the era of AI-driven applications, personalization is paramount. This talk explores the concept of Full RAG (Retrieval-Augmented Generation) and its potential to revolutionize user experiences across industries. We examine four levels of context personalization, from basic recommendations to highly tailored, real-time interactions.
The presentation demonstrates how increasing levels of context – from batch data to streaming and real-time inputs – can dramatically improve AI model outputs. We discuss the challenges of implementing sophisticated context personalization, including data engineering complexities and the need for efficient, scalable solutions.
Introducing the concept of a Context Platform, we showcase how tools like Tecton can simplify the process of building, deploying, and managing personalized context at scale. Through practical examples in travel recommendations, we illustrate how developers can easily create and integrate batch, streaming, and real-time context using simple Python code, enabling more engaging and valuable AI-powered experiences.
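
To make the levels of context concrete, here is a toy sketch in plain Python (the store names and fields are hypothetical, not Tecton’s API) of how batch, streaming, and real-time signals might be layered into a prompt:

```python
def build_context(user_id, batch_store, stream_store, request):
    """Assemble prompt context from increasingly fresh sources:
    batch features (nightly aggregates), streaming features
    (last-hour activity), and the real-time request itself."""
    parts = [
        f"preferences: {batch_store.get(user_id, 'unknown')}",
        f"recent activity: {stream_store.get(user_id, 'none')}",
        f"current request: {request}",
    ]
    return "; ".join(parts)

# Hypothetical travel-recommendation example
batch_store = {"u1": "beach destinations, budget travel"}
stream_store = {"u1": "viewed 3 Lisbon hotels in the last hour"}
print(build_context("u1", batch_store, stream_store, "flights to Lisbon in May"))
```

Each level adds fresher information, which is the point of the talk: the more current the context, the more tailored the model’s output can be.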

Talk: Enterprise AI Alignment: Engineering Trust into GenAI Systems

Presenter:
Ron Baker, Chief Technology Officer, Trustwise

About the Speaker:
Ron Baker is the Chief Technology Officer of Trustwise, focused on bringing his expertise in enterprise software scale, reliability, security and usability to the innovative features the team has delivered. He is a recently retired IBM Distinguished Engineer from IBM’s Sustainability Software organization, whose product suites manage Assets, Building Facilities, Supply Chains, Weather and Environmental Intelligence, and ESG Reporting, featuring climate risk analysis and financed emissions. There he led the Environmental Insights strategy, the SRE discipline, and operations technology for SaaS offerings.

Talk Abstract:
In the race to build powerful AI systems, alignment and trustworthiness have become central challenges for both researchers and enterprises. While organizations like OpenAI and Anthropic focus on general alignment for Artificial General Intelligence (AGI), ensuring AI models behave ethically and follow human values, businesses face an additional layer of complexity. For enterprises, especially in regulated industries like healthcare, finance, and insurance, AI systems must meet strict operational, ethical, and regulatory standards to ensure safety and compliance.

This session will dive into the practical aspects of AI alignment, focusing on how businesses can balance the competing demands of performance, safety, and compliance while ensuring their AI systems are trustworthy and aligned with real-world goals.

Talk: Your Entire AI/ML Lifecycle in A Single Platform

Presenter:
Hudson Buzby, Solution Architect, JFrog

About the Speaker:
Hudson Buzby is a Solutions Architect at JFrog, helping customers build and design machine learning systems. Prior to that, Hudson spent a number of years in the data engineering space, particularly in the world of Apache Spark, Kafka, and distributed systems.

Talk Abstract:
Dive into the world of deploying AI/ML applications to production with JFrog ML. This quick session will showcase our advanced MLOps, LLMOps, and Feature Store capabilities designed to streamline your AI development processes.

  • Watch how JFrog ML streamlines AI workflows from ideation to production, ensuring efficiency and quality.
  • Discover tools designed for managing and optimizing large language models (LLMs) effectively.
  • Learn how feature stores can significantly improve data management and model performance in your AI projects.

Talk: ML Feature Lakehouse: Empowering Data Scientists to Build Petabyte-Scale Pipelines with Iceberg

Presenter:
Simba Khadder, Founder & CEO, Featureform

About the Speaker:
Simba Khadder is the Founder & CEO of Featureform. After leaving Google, Simba founded his first company, TritonML. His startup grew quickly and Simba and his team built ML infrastructure that handled over 100M monthly active users. He instilled his learnings into Featureform’s virtual feature store. Featureform turns your existing infrastructure into a Feature Store. He’s also an avid surfer, a mixed martial artist, a published astrophysicist for his work on finding Planet 9, and he ran the SF marathon in basketball shoes.

Talk Abstract:
The term “Feature Store” might sound like just a place to store features, but in reality, it’s a powerful system for defining, managing, and deploying large-scale data pipelines. This session will simplify feature stores by breaking down the three main types and showing how they fit into an ML ecosystem. We’ll explore how feature stores enable data scientists to build, manage, and scale their own pipelines, even at petabyte levels, while handling streaming data and ensuring versioning and lineage.

Join Simba Khadder, founder and CEO of Featureform, as he cuts through the jargon and delivers practical, real-world examples. You’ll learn how feature stores can be used to build scalable data pipelines for AI/ML, and get a clear roadmap for integrating them into your ML workflows.

We’ll also take a look under the hood to see how Featureform achieves this scale using Apache Iceberg, so you leave with actionable insights to improve your ML platforms and projects.

Talk: Efficient AI Scaling: How VESSL AI Enables 100+ LLM Deployments for $10 and Saves $1M Annually

Presenter:
Jaeman An, Co-Founder & CEO, VESSL AI

About the Speaker:
Jaeman An is the CEO of VESSL AI and a graduate of KAIST with a background in Electrical and Electronic Engineering. He has extensive experience in machine learning and DevOps, having previously led the DevOps infrastructure for the mobile game ‘Cookie Run,’ which achieved 10 million daily active users. He also served as VP of Engineering at a medical AI startup, launching multiple successful AI services. His drive to solve inefficiencies in AI deployment and operations led him to found VESSL AI.

Talk Abstract:
This session will demonstrate how VESSL AI enables enterprises to efficiently scale and deploy over 100 Large Language Models (LLMs) starting at just $10, saving up to $1M annually in cloud costs. We will explore real-world case studies from industries like finance, healthcare, and e-commerce, showcasing practical solutions to optimize infrastructure and reduce operational costs.

Talk: Catching Bad Guys Using Open Data and Open Models for Graphs to Power AI Apps

Presenter:
Paco Nathan, Principal DevRel Engineer, Senzing

About the Speaker:
Paco Nathan leads DevRel for the Entity Resolved Knowledge Graph practice area at Senzing.com and is a computer scientist with 40+ years of tech industry experience and core expertise in data science, natural language, graph technologies, and cloud computing. He’s the author of numerous books, videos, and tutorials about these topics.

Paco advises Argilla.io (acq. Hugging Face), Kurve.ai, EmergentMethods.ai, KungFu.ai, and DataSpartan, and is lead committer for the `pytextrank` and `kglab` open source projects. Formerly: Director of Learning Group at O’Reilly Media; and Director of Community

Talk Abstract:
GraphRAG is a popular way to use knowledge graphs to ground AI apps in facts. Most GraphRAG tutorials use LLMs to build graphs automatically from unstructured data. However, what if you’re working on use cases such as investigations and sanctions compliance — “catching bad guys” — where transparency for decisions and evidence are required?

This talk introduces how investigative practices leverage open data for AI apps, using entity resolution to build graphs that are accountable. We’ll look at resources such as Open Sanctions and Open Ownership, plus data models used to explore less-than-legit behaviors at scale, such as money laundering through anonymized offshore corporations. We’ll show SOTA open models used for components of this work, such as named entity recognition, relation extraction, textgraphs, and entity linking, and link to extended tutorials based on open source.
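
As a toy illustration of the entity-resolution idea (the dataset names are real, but the record IDs and the crude name matcher are invented for this sketch), records from different open sources that refer to the same company can be merged into one entity while keeping their source IDs as auditable evidence:

```python
records = [
    {"id": "os:101", "name": "ACME Holdings Ltd", "source": "OpenSanctions"},
    {"id": "oo:7", "name": "Acme Holdings Limited", "source": "OpenOwnership"},
    {"id": "oo:9", "name": "Blue Harbor LLC", "source": "OpenOwnership"},
]

def normalize(name):
    # Crude matcher for illustration only: lowercase and expand one abbreviation.
    return name.lower().replace("ltd", "limited").strip()

def resolve(records):
    """Group records whose normalized names match, keeping source IDs as evidence."""
    entities = {}
    for rec in records:
        entities.setdefault(normalize(rec["name"]), []).append(rec["id"])
    return entities

entities = resolve(records)
print(entities)  # the two ACME records collapse into a single entity
```

Real entity resolution uses far more robust matching, but the key property is the same: every merged node retains the source records behind it, so decisions stay transparent.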

Talk: AI Tools Under Control: Keeping Your Agents Secure and Reliable

Presenter:
Bar Chen, Senior Software Engineer, Aporia

About the Speaker:
Bar Chen is a senior software engineer and product manager at Aporia, where he has worked for the past two years. He has spent nearly eight years in the tech industry, focusing on both cybersecurity and AI.

Talk Abstract:
This session focuses on AI tools and the importance of keeping them secure and reliable. We’ll discuss the main security challenges these tools face and share simple, practical solutions to address them. You’ll discover how applying best practices can help protect your AI systems, reduce risks, and maximize their effectiveness.

Talk: Era of Multimodal AI & Reasoning

Presenter:
Ivan Nardini, Developer Relation Engineer, AI/ML, Google Cloud

About the Speaker:
Ivan Nardini is a Developer Relations Engineer on Google’s Cloud team, focusing on Artificial Intelligence and Machine Learning. He enables developers to build innovative AI and ML applications using their preferred libraries, models, and tools on Vertex AI, through code samples, online content, and events. Ivan holds a master’s degree in Economics and Social Sciences from Università Bocconi and completed specialized training in Data Science at the Barcelona Graduate School of Economics.

Talk Abstract:
The future of AI is multimodal. In this session, you will explore the importance of large context windows for effective reasoning over multiple modalities, and how caching mechanisms, similar to human memory, can enhance performance. You will also learn how large context windows and context caching unlock exciting new use cases, and why multimodal AI is crucial for building better systems.

Talk: Mastering Enterprise-Grade LLM Deployment: Overcoming Production Challenges

Presenter:
Jaeman An, Co-Founder & CEO, VESSL AI

About the Speaker:
Jaeman An graduated from KAIST with a degree in Electrical and Electronic Engineering. He has extensive expertise in DevOps and machine learning, having played a pivotal role in managing the DevOps infrastructure for the mobile game ‘Cookie Run,’ which reached 10 million daily active users. He later joined a medical AI startup as VP of Engineering, where he successfully launched and operated various AI services. Through this experience, he identified inefficiencies in machine learning development and operations processes and founded VESSL AI to address these challenges.

Talk Abstract:
This session delves into the practical challenges of deploying Large Language Models (LLMs) in production, particularly for enterprise applications. We’ll cover topics such as managing computational resources, optimizing model performance, ensuring data security, and adhering to compliance standards. The talk will also showcase strategies to mitigate these challenges, focusing on infrastructure management, latency reduction, and model reliability. Case studies from industries such as healthcare, finance, and e-commerce will illustrate how enterprises can safely and efficiently integrate LLMs into their existing systems.