Episodes

  • Supervised machine learning for science with Christoph Molnar and Timo Freiesleben, Part 2
    Mar 27 2025

    Part 2 of this series could easily have been renamed “AI for science: The expert’s guide to practical machine learning.” We continue our discussion with Christoph Molnar and Timo Freiesleben to look at how scientists can apply the supervised machine learning techniques from the previous episode in their research.

    Introduction to supervised ML for science (0:00)

    • Welcome back to Christoph Molnar and Timo Freiesleben, co-authors of “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box”

    The model as the expert? (1:00)

    • Evaluation metrics have profound downstream effects on all modeling decisions
    • Data augmentation offers a simple yet powerful way to incorporate domain knowledge
    • Domain expertise is often undervalued in data science despite being crucial

    Measuring causality: Metrics and blind spots (10:10)

    • Causality approaches in ML range from exploring associations to inferring treatment effects

    Connecting models to scientific understanding (18:00)

    • Interpretation methods must stay within realistic data distributions to yield meaningful insights

    Robustness across distribution shifts (26:40)

    • Robustness requires understanding what distribution shifts affect your model
    • Pre-trained models and transfer learning provide promising paths to more robust scientific ML

    Reproducibility challenges in ML and science (35:00)

    • Reproducibility challenges differ between traditional science and machine learning

    Go back to listen to part one of this series for the conceptual foundations that support these practical applications.

    Check out Christoph and Timo's book “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box” available online now.




    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    42 m
  • Supervised machine learning for science with Christoph Molnar and Timo Freiesleben, Part 1
    Mar 25 2025

    Machine learning is transforming scientific research across disciplines, but many scientists remain skeptical about using approaches that focus on prediction over causal understanding.

    That’s why we are excited to have Christoph Molnar return to the podcast with Timo Freiesleben. They are co-authors of "Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box." We talk about the perceived problems with automation in certain sciences and find out how scientists can use machine learning without losing scientific accuracy.

    • Different scientific disciplines have varying goals beyond prediction, including control, explanation, and reasoning about phenomena
    • Traditional scientific approaches build models from simple to complex, while machine learning often starts with complex models
    • Scientists worry about using ML due to lack of interpretability and causal understanding
    • ML can both integrate domain knowledge and test existing scientific hypotheses
    • "Shortcut learning" occurs when models find predictive patterns that aren't meaningful
    • Machine learning adoption varies widely across scientific fields
    • Ecology and medical imaging have embraced ML, while other fields remain cautious
    • Future directions include ML potentially discovering scientific laws humans can understand
    • Researchers should view machine learning as another tool in their scientific toolkit

    Stay tuned! In part 2, we'll shift the discussion with Christoph and Timo to talk about putting these concepts into practice.


    27 m
  • The future of AI: Exploring modeling paradigms
    Feb 25 2025

    Unlock the secrets to AI's modeling paradigms. We emphasize the importance of modeling practices, how they interact, and how they should be considered in relation to each other before you act. Using the right tool for the right job is key. We hope you enjoy these examples of where the greatest AI and machine learning techniques exist in your routine today.

    More AI agent disruptors (0:56)

    • Proxy from London start-up Convergence AI
    • Another hit to OpenAI: this product is available for free, unlike OpenAI’s Operator.

    AI Paris Summit - What's next for regulation? (4:40)

    • [Vice President] Vance tells Europeans that heavy regulation can kill AI
    • The US federal administration is withdrawing from the previous trend of sweeping big tech regulation on modeling systems.
    • The EU is pushing to reduce bureaucracy but not regulatory pressure

    Modeling paradigms explained (10:33)

    • As companies look for an edge in high-stakes computations, we’ve seen best-in-class teams rediscover expert system-based techniques and, with modern computing power, breathe new life into them.
      • Paradigm 1: Agents (11:23)
      • Paradigm 2: Generative (14:26)
      • Paradigm 3: Mathematical optimization (regression) (18:33)
      • Paradigm 4: Predictive (classification) (23:19)
      • Paradigm 5: Control theory (24:37)

    The right modeling paradigm for the job? (28:05)


    34 m
  • Agentic AI: Here we go again
    Feb 1 2025

    Agentic AI is the latest foray into big-bet promises for businesses and society at large. While promising autonomy and efficiency, AI agents raise fundamental questions about their accuracy, governance, and the potential pitfalls of over-reliance on automation.

    Does this story sound vaguely familiar? Hold that thought. This discussion about the over-under of certain promises is for you.


    Show Notes


    The economics of LLMs and DeepSeek R1 (00:00:03)

    • Reviewing recent developments in AI technologies and their implications
    • Discussing the impact of DeepSeek’s R1 model on the AI landscape and NVIDIA


    The origins of agentic AI (00:07:12)

    • Status quo of AI models to date: Is big tech backing away from the promise of generative AI?
    • Agentic AI designed to perceive, reason, act, and learn


    Governance and agentic AI (00:13:12)

    • Examining the tension between cost efficiency and performance risks [LangChain State of AI Agents Report]
    • Highlighting governance concerns related to AI agents


    Issues with agentic AI implementation (00:21:01)

    • Considering the limitations of AI agents and their adoption in the workplace
    • Analyzing real-world experiments with AI agent technologies, like Devin


    What's next for complex and agentic AI systems (00:29:27)

    • Offering insights on the cautious integration of these systems in business practices
    • Encouraging a thoughtful approach to leveraging AI capabilities for measurable outcomes

    30 m
  • Contextual integrity and differential privacy: Theory vs. application with Sebastian Benthall
    Jan 7 2025

    What if privacy could be as dynamic and socially aware as the communities it aims to protect? Sebastian Benthall, a senior research fellow from NYU’s Information Law Institute, shows us how privacy is complex. He uses Helen Nissenbaum’s work with contextual integrity and concepts in differential privacy to explain the complexity of privacy. Our talk explains how privacy is not just about protecting data but also about following social rules in different situations, from healthcare to education. These rules can change privacy regulations in big ways.

    Show notes

    Intro: Sebastian Benthall (0:03)

    • Research: Designing Fiduciary Artificial Intelligence (Benthall, Shekman)
    • Integrating Differential Privacy and Contextual Integrity (Benthall, Cummings)

    Exploring differential privacy and contextual integrity (1:05)

    • Discussion about the origins of each subject
    • How are differential privacy and contextual integrity used to enforce each other?

    Accepted context or legitimate context? (9:33)

    • Does context develop from what society accepts over time?
    • Approaches to determine situational context and legitimacy

    Next steps in contextual integrity (13:35)

    • Is privacy as we know it ending?
    • Areas where integrated differential privacy and contextual integrity can help (Cummings)

    Interpretations of differential privacy (14:30)

    • Not a silver bullet
    • New questions posed by NIST about its application

    Privacy determined by social norms (20:25)

    • Game theory and its potential for understanding social norms

    Agents and governance: what will ultimately decide privacy? (25:27)

    • Voluntary disclosures and the biases they can present towards groups that are least concerned with privacy
    • Avoiding self-fulfilling prophecy from data and context



    33 m
  • Model documentation: Beyond model cards and system cards in AI governance
    Nov 9 2024

    What if the secret to successful AI governance lies in understanding the evolution of model documentation? In this episode, our hosts challenge the common belief that model cards marked the start of documentation in AI. We explore model documentation practices, from their crucial beginnings in fields like finance to their adaptation in Silicon Valley. Our discussion also highlights the important role of early modelers and statisticians in advocating for a complete approach that includes the entire model development lifecycle.

    Show Notes

    Model documentation origins and best practices (1:03)

    • Documenting a model is a comprehensive process that requires giving users and auditors a clear understanding of:
      • Why was the model built?
      • What data goes into a model?
      • How is the model implemented?
      • What does the model output?


    Model cards - pros and cons (7:33)

    • Model cards for model reporting, Association for Computing Machinery
    • Evolution from this research to Google's definition to today
    • How the market perceives them vs. what they are
    • Why the analogy “nutrition labels for models” needs a closer look


    System cards - pros and cons (12:03)

    • To their credit, OpenAI system cards somewhat bridge the gap between proper model documentation and a model card.
    • Contains complex descriptions of evaluation methodologies along with results; extra points for reporting red-teaming results
    • Represents 3rd-party opinions of the social and ethical implications of the release of the model


    Automating model documentation with generative AI (17:17)

    • Finding the balance in automation in a great governance strategy
    • Generative AI can provide an assist in editing and personal workflow


    Improving documentation for AI governance (23:11)

    • As the model expert, engage from the beginning by writing the bulk of the model documentation by hand.
    • The exercise of documenting your models solidifies your understanding of the model's goals, values, and methods for the business

    28 m
  • New paths in AI: Rethinking LLMs and model risk strategies
    Oct 8 2024

    Are businesses ready for large language models as a path to AI? In this episode, the hosts reflect on the past year of what has changed and what hasn’t changed in the world of LLMs. Join us as we debunk the latest myths and emphasize the importance of robust risk management in AI integration. The good news is that many decisions about adoption have forced businesses to discuss their future and impact in the face of emerging technology. You won't want to miss this discussion.

    • Intro and news: The veto of California's AI Safety Bill (00:00:03)
      • Can state-specific AI regulations really protect consumers, or do they risk stifling innovation? (Gov. Newsom's response)
      • Veto highlights the critical need for risk-based regulations that don't rely solely on the size and cost of language models
      • Arguments to be made for a cohesive national framework that ensures consistent AI regulation across the United States
    • Are businesses ready to embrace large language models, or are they underestimating the challenges? (00:08:35)
      • The myth that acquiring a foundational model is a quick fix for productivity woes
      • The essential role of robust risk management strategies, especially in sensitive sectors handling personal data
      • Review of model cards, OpenAI's system cards, and the importance of thorough testing, validation, and stricter regulations to prevent a false sense of security
      • Transparency alone is not enough; objective assessments are crucial for genuine progress in AI integration
    • From hallucinations in language models to ethical energy use, we tackle some of the most pressing problems in AI today (00:16:29)
      • Reinforcement learning with annotators and the controversial use of other models for review
      • Yann LeCun's energy-based models and retrieval-augmented generation (RAG) offer intriguing alternatives that could reshape modeling approaches
    • The ethics of advancing AI technologies: parallels with past monumental achievements and the responsible allocation of resources (00:26:49)
      • There is good news about developments and lessons learned from LLMs, but there is also a long way to go.
      • Our original prediction for LLMs in episode 2 still rings true: “Reasonable expectations of LLMs: Where truth matters and risk tolerance is low, LLMs will not be a good fit”
      • With increased hype and awareness from LLMs came varying levels of interest in how all model types and their impacts are governed in a business.


    40 m
  • Complex systems: What data science can learn from astrophysics with Rachel Losacco
    Sep 4 2024

    Our special guest, astrophysicist Rachel Losacco, explains the intricacies of galaxies, modeling, and the computational methods that unveil their mysteries. She shares stories about how advanced computational resources enable scientists to decode galaxy interactions over millions of years with true-to-life accuracy. Sid and Andrew discuss transferable practices for building resilient modeling systems.

    • Prologue: Why it's important to bring stats back [00:00:03]
      • Announcement from the American Statistical Association (ASA): Data Science Statement Updated to Include “and AI”
    • Today's guest: Rachel Losacco [00:02:10]
      • Rachel is an astrophysicist who’s worked with major galaxy formation simulations for many years. She hails from Leiden (Lie-den) University and the University of Florida. As a Senior Data Scientist, she works on modeling road safety.
    • Defining complex systems through astrophysics [00:02:52]
      • Discussion about origins and adoption of complex systems
      • Difficulties with complex systems: Nonlinearity, chaos and randomness, collective dynamics and hierarchy, and emergence.
    • Complexities of nonlinear systems [00:08:20]
      • Linear models (Least Squares, GLMs, SVMs) can be incredibly powerful but they cannot model all possible functions (e.g. a decision boundary of concentric circles)
      • Non-linearity and how it exists in the natural world
    • Chaos and randomness [00:11:30]
      • Enter references to Jurassic Park and The Butterfly Effect
      • “In universe simulations, a change to a single parameter can govern if entire galaxy clusters will ever form” - Rachel
    • Collective dynamics and hierarchy [00:15:45]
      • Interactions between agents don’t occur globally and are often mediated through effects that only happen on specific sub-scales
      • Adaptation: components of systems breaking out of linear relationships between inputs and outputs to better serve the function of the greater system
    • Emergence and complexity [00:23:36]
      • New properties arise from the system that cannot be explained by the base rules governing the system
    • Examples in astrophysics [00:24:34]
      • These difficulties are parts of solving previously impossible problems
      • Consider this lecture from IIT Delhi on Complex Systems to get a sense of what is required to study and formalize a complex system and its collective dynamics (https://www.youtube.com/watch?v=yJ39ppgJlf0)
    • Consciousness and reasoning from a new point of view [00:31:45]
      • Non-linearity, hierarchy, feedback loops, and emergence may be ways to study consciousness. The brain is a complex system that a simple set of rules cannot fully define.
      • See: Brain modeling from scratch of C. elegans
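
    The concentric-circles example from the nonlinear-systems segment above can be made concrete with a small sketch. It is an illustration under stated assumptions, not something taken from the episode: the ring radii, the point count, and the helper names (make_rings, train_perceptron, accuracy) are hypothetical, and a plain perceptron stands in for the linear models named in the notes. A linear boundary on the raw coordinates cannot separate the two rings, while the same learner separates them once a squared-radius feature is added.

```python
import math


def make_rings(n_per_ring=100, inner_radius=1.0, outer_radius=3.0):
    """Evenly spaced points on two concentric circles.

    Inner ring is labeled -1, outer ring +1 (radii are illustrative assumptions).
    """
    points, labels = [], []
    for radius, label in ((inner_radius, -1), (outer_radius, 1)):
        for k in range(n_per_ring):
            angle = 2 * math.pi * k / n_per_ring
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
            labels.append(label)
    return points, labels


def train_perceptron(features, labels, max_epochs=1000):
    """Classic perceptron: a purely linear decision rule w.x + b > 0."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified (or on the boundary): update
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # a clean pass means the data is separated
            break
    return weights, bias


def accuracy(features, labels, weights, bias):
    """Fraction of points whose predicted sign matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        predicted = 1 if score > 0 else -1
        correct += predicted == y
    return correct / len(labels)


points, labels = make_rings()

# Raw coordinates: no straight line separates an inner ring from an outer ring.
raw = [(x, y) for x, y in points]
raw_acc = accuracy(raw, labels, *train_perceptron(raw, labels))

# Lifted coordinates: adding x^2 + y^2 makes the rings linearly separable.
lifted = [(x, y, x * x + y * y) for x, y in points]
lifted_acc = accuracy(lifted, labels, *train_perceptron(lifted, labels))

print(f"raw features:    accuracy = {raw_acc:.2f}")
print(f"lifted features: accuracy = {lifted_acc:.2f}")
```

    Running the script prints both accuracies: the raw-coordinate model stays far from perfect, while the model with the squared-radius feature classifies every point correctly. The model class is still linear in both cases; only the feature representation changes.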



    41 m