Data Hurdles

By: Michael Burke and Chris Detzel
  • Summary

  • Data Hurdles is a podcast that brings the stories of data professionals to life, showcasing the challenges, triumphs, and insights from those shaping the future of data. Hosted by Michael Burke and Chris Detzel, this podcast dives into the real-world experiences of data experts as they navigate topics like data quality, security, AI, data literacy, and machine learning. Each episode features guest data professionals who share their journeys, lessons learned, and the impact of data on industries, technology, and society. From overcoming obstacles in data pipelines to implementing groundbreaking AI solutions, Data Hurdles highlights the human side of data and the stories behind the innovations that are transforming the world. Join us to hear firsthand accounts of how data professionals are solving complex problems and driving the future of technology.
    2024 All Rights Reserved
Episodes
  • DeepSeek's Cost-Efficient Model Training ($5M vs hundreds of millions for competitors)
    Feb 22 2025

    The episode features hosts Chris Detzel and Michael Burke discussing DeepSeek, a Chinese AI company making waves in the large language model (LLM) space. Here are the key discussion points:

    Major Breakthrough in Cost Efficiency:
    - DeepSeek claimed they trained their latest model for only $5 million, compared to hundreds of millions or billions spent by competitors like OpenAI
    - This cost efficiency created market disruption, particularly affecting NVIDIA's stock as it challenged assumptions about necessary GPU resources

    Mixture of Experts (MoE) Innovation:
    - Instead of using one large model, DeepSeek uses multiple specialized "expert" models
    - Each expert model focuses on specific areas/topics
    - Uses reinforcement learning to route queries to the appropriate expert model
    - This approach reduces both training and inference costs
    - DeepSeek notably open-sourced their MoE architecture, unlike other major companies
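
    The routing idea above can be illustrated with a minimal sketch. It assumes a softmax top-k gate, a common Mixture of Experts technique, rather than the reinforcement-learning routing mentioned in the episode, and it is not DeepSeek's actual implementation. The names `route_top_k` and `moe_forward`, the toy experts, and the gate scores are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and combine their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Hypothetical toy "experts": in a real model these are neural sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]

# Hypothetical gate scores: in a real model a learned router produces them.
gate_scores = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(selected, output)
```

    Only two of the four experts run for this input, which is where the inference savings come from: compute scales with the number of experts selected, not the total number of experts.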

    Technical Infrastructure:
    - Discussion of how DeepSeek achieved results without access to NVIDIA's latest GPUs
    - Highlighted the dramatic price increase in NVIDIA GPUs (from $3,000 to $30,000-$50,000) due to AI demand
    - Explained how inference costs (serving the model) often exceed training costs

    Chain of Thought Reasoning:
    - DeepSeek open-sourced their chain of thought reasoning system
    - This allows models to break down complex questions into steps before answering
    - Improves accuracy on complicated queries, especially math problems
    - Comparable to Meta's Llama in terms of open-source contributions to the field

    Broader Industry Impact:
    - Discussion of how businesses are integrating AI into their products
    - Example of ZoomInfo using AI to aggregate business intelligence and automate sales communications
    - Noted how technical barriers to AI implementation are lowering through platforms like Databricks

    The hosts also touched on data privacy concerns regarding Chinese tech companies entering the US market, drawing parallels to TikTok discussions. They concluded by discussing how AI tools are making technical development more accessible to non-experts and mentioned the importance of being aware of how much personal information these models collect about users.

    25 min
  • Clean Data, Business Context, and the Future of Analytics - Featuring Noy Twerski, Sherloq Co-founder & CEO
    Feb 17 2025

    This episode of Data Hurdles features an in-depth conversation with Noy Twerski, CEO and Co-founder of Sherloq, a collaborative SQL repository platform. The discussion, hosted by Chris Detzel and Michael Burke, explores several key themes in data analytics and management.

    Key Topics Covered:

    1. Introduction to Sherloq
    - Sherloq is introduced as a plugin that integrates with various SQL editors including Databricks, Snowflake, and JetBrains editors
    - The platform serves as a centralized repository for SQL queries, addressing the common problem of scattered SQL code across organizations

    2. Origin Story
    - Twerski shares her background as a product manager who experienced firsthand the challenges of managing SQL queries
    - The company was founded about 2.5 years ago with her co-founder Nadav, whom she knew from computer science undergrad
    - They identified the problem through extensive user research, finding that 80% of data analysts struggled with locating their tables, fields, and SQL

    3. Business Context and AI Discussion
    - A significant portion of the conversation focuses on the relationship between SQL, business context, and AI
    - The hosts and guest discuss the challenges of automating SQL generation through AI, emphasizing the importance of business context
    - They explore why text-to-SQL solutions are more complex than they appear, particularly in enterprise settings

    4. Future Outlook
    - Discussion of Sherloq's future plans, focusing on deepening their collaborative SQL repository capabilities
    - Exploration of how the platform could serve as infrastructure for future AI capabilities
    - Consideration of data quality as an ongoing challenge in the enterprise data space

    5. Industry Insights
    - The conversation includes broader discussions about data quality, governance, and the evolution of data teams
    - Twerski shares insights about different user personas and how they approach the product differently

    Notable Aspects:
    - The podcast includes interesting perspectives on the future of data analytics and AI
    - There's a strong emphasis on practical business applications and real-world challenges
    - The hosts and guest share thoughtful insights about data quality as a persistent challenge in the industry

    The episode provides valuable insights for data professionals, particularly those interested in data management, SQL development, and the evolution of data tools in an AI-driven landscape.

    34 min
  • Top 10 MDM 2025 Platforms - Who's Rising, Who's Falling & Why It Matters
    Dec 1 2024

    The Data Hurdles Impact Index (DHII) provides a comprehensive analysis of the top Master Data Management platforms, evaluating vendors based on multi-domain capabilities, core features, AI enablement, data governance integration, architecture flexibility, total cost of ownership, market reach, and vendor stability. This inaugural DHII analysis covers ten leading MDM platforms that are shaping enterprise data management in 2025.

    The assessment, led by 20-year MDM veteran Rohit Singh Verma, Director of Data Practice at Nvizion Solutions, examines market leaders and emerging players including Informatica, Stibo Systems, Profisee, Reltio, Ataccama, TIBCO EBX, IBM InfoSphere MDM, SAP MDM, Syndigo, and Viamedici. Each vendor is evaluated through the lens of practical implementation experience, market presence, and technological innovation.

    Key findings reveal Informatica's continued dominance with its IDMC cloud offering, though it faces increasing pressure in specific domains from specialists such as Stibo Systems in product data management. The analysis highlights a significant market opportunity in the Middle East, where only select vendors have established a strong presence. The DHII also identifies critical factors beyond technical capabilities, including the importance of system integrator networks, implementation speed, and regional market penetration.

    The evaluation exposes interesting market dynamics, such as the challenges faced by legacy vendors like IBM and SAP in keeping pace with cloud-native solutions, and the emergence of AI-enabled capabilities as a key differentiator. The analysis also addresses the persistent challenge of high implementation failure rates (estimated at 75%) and how vendors are evolving to address this through improved user interfaces, AI-assisted implementations, and stronger partner ecosystems.

    This groundbreaking DHII assessment serves as an essential guide for organizations navigating the complex MDM vendor landscape, offering insights that go beyond traditional analyst evaluations to provide a practical, implementation-focused perspective on the market's leading solutions.

    1 h 7 min
