
What we’re about
Silicon Valley Generative AI is a dynamic community of professionals, researchers, startup founders, and enthusiasts who share a passion for generative AI technology. As part of the wider GenAI Collective network, the group provides a fertile ground for the exploration of cutting-edge research, applications, and discussions on all things related to generative AI.
Our community thrives on two main types of engagement. First, in partnership with Boulder Data Science, we host bi-weekly "Paper Reading" sessions. These meetings are designed for deep dives into the latest machine learning papers, fostering a culture of continuous learning and collaborative research. It's an excellent opportunity for anyone looking to understand the scientific advances propelling the field forward.
Second, we organize monthly "Talks" that offer a broader range of insights into the world of generative AI. These sessions feature presentations by an eclectic mix of speakers, from industry pioneers and esteemed researchers to emerging startup founders and subject matter experts. Unlike the paper reading sessions, which are more academically inclined, the talks are tailored to a more general audience. Topics run the gamut from the technical intricacies of the latest generative models to their real-world applications, startup pitches, and even discussions of the legal and ethical implications of AI.
Whether you're a seasoned professional or merely curious about generative AI, Silicon Valley Generative AI provides a comprehensive platform to learn, discuss, and network.
We strive to be an inclusive community that fosters innovation, knowledge-sharing, and a collective drive to shape the future of AI responsibly. Join us to stay at the forefront of generative AI research, news, and applications.
For those eager to dive deeper into the technical aspects, you can join us on the GenAI Collective Slack, specifically the #discuss-technical channel, to keep the conversations flowing between meetups.
We are also looking for the following:
• Readers: people who are willing to read papers and speak about them.
• Speakers and presenters: who will put together educational materials and present to the group as well as answer questions.
• Industry events: if you have a generative AI event such as a hackathon, lunch and learn, or an information session about your product, we would be happy to include it in the calendar.
Please contact Matt White at contact@matt-white.com
Upcoming events (4+)
Reinforcement Learning: Chapter 3 Finite Markov Decision Processes
Chapter 3 introduces the mathematical formalism for defining the full reinforcement learning problem in the book. We will cover the definition of probability transition functions, reward signals, and the discounted return. If there is time, we will continue with the discussion of policies and value functions, as explained with the gridworld example.
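The discounted return covered in this session can be illustrated with a minimal sketch (an illustrative example, not material from the book): the return G_t sums future rewards, each weighted by an extra factor of the discount rate gamma.

```python
# Discounted return as defined in Chapter 3 of Sutton & Barto:
#   G_t = R_{t+1} + gamma * R_{t+2} + gamma^2 * R_{t+3} + ...

def discounted_return(rewards, gamma=0.9):
    """Compute G_0 for a finite episode with rewards R_1, ..., R_T."""
    g = 0.0
    # Accumulate from the end of the episode so each earlier step
    # applies exactly one more factor of gamma: G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A sparse-reward episode: only the final step pays off, discounted twice.
print(discounted_return([0, 0, 1], gamma=0.9))  # 0.9**2 * 1 = 0.81
```

The backward recursion G_t = R_{t+1} + gamma * G_{t+1} is the same identity that underlies the value-function definitions discussed with the gridworld example.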
As usual you can find below links to the textbook, previous chapter notes, slides, and recordings of some of the previous meetings.
Useful Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Recordings of Previous Meetings
Short RL Tutorials
My exercise solutions and chapter notes
Kickoff Slides which contain other links
Video lectures from a similar course

Generative AI Paper Reading: Illusion of Thinking and Rebuttal
Join us for a paper discussion by Megan Robertson on "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity" and its rebuttal, "The Illusion of the Illusion of Thinking".
Examining the capabilities and evaluation pitfalls of Large Reasoning Models (LRMs) in complex reasoning tasks.
***
## Featured Papers
- "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity" (Shojaee et al., 2025): arXiv Paper
- "The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025)" (Opus, Lawsen, 2025): arXiv Rebuttal
***
## Discussion Topics
## Original Study: LRM Reasoning and Scaling Analysis
- Evaluation of Large Reasoning Models (LRMs) on systematically designed puzzles (e.g., Tower of Hanoi, River Crossing) to probe reasoning depth and trace quality
- Identification of three performance regimes:
  - Standard LLMs outperform LRMs on low-complexity tasks.
  - LRMs show an advantage on medium-complexity tasks.
  - Both model types experience "accuracy collapse" on high-complexity tasks, even with a sufficient token budget.
- Observed phenomena:
  - Reasoning effort increases with problem complexity, then unexpectedly declines at higher levels.
  - LRMs struggle with exact computation, often failing to use explicit algorithms or to reason consistently across different scales.
  - Internal reasoning traces reveal inconsistent or incomplete solution exploration as complexity rises.
## Rebuttal: Experimental Design and Evaluation Critique
- Argues that reported "accuracy collapse" is largely due to experimental artifacts rather than fundamental model limitations.
- Key criticisms:
  - Token Limitations: For tasks like Tower of Hanoi, the number of required output tokens grows rapidly with problem size, exceeding model context windows at collapse points. Models sometimes explicitly state output truncation due to length constraints.
  - Evaluation Framework: Automated scoring does not distinguish between genuine reasoning failures and practical output constraints, leading to misclassification of model capabilities.
  - Impossible Tasks: Some River Crossing instances tested are mathematically unsolvable (e.g., too many actors for the boat capacity), yet models are penalized for not solving them, misrepresenting their reasoning ability.
  - Alternative Evaluation: When models are asked for algorithmic solutions (e.g., code that generates the answer) rather than exhaustive move lists, they succeed on problems previously reported as failures, demonstrating intact reasoning when freed from output format constraints.
  - Complexity Metrics: The rebuttal argues that solution length alone is a poor proxy for problem difficulty; true complexity depends on branching factor and search requirements, not just output size.
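The token-limitation and alternative-evaluation criticisms above can be illustrated with a short sketch (illustrative only, not code from either paper): an optimal Tower of Hanoi solution for n disks requires 2**n - 1 moves, so an exhaustive move list grows exponentially with n while the algorithm that generates it stays a few lines long.

```python
# Illustrative sketch: optimal Tower of Hanoi move list for n disks.
# The list has 2**n - 1 entries, so enumerating it in full quickly
# exhausts any fixed output budget, even though the generating
# algorithm is trivially small.

def hanoi_moves(n, src="A", dst="C", aux="B"):
    """Return the optimal move list (src_peg, dst_peg) for n disks."""
    if n == 0:
        return []
    # Move n-1 disks aside, move the largest disk, then stack the rest on top.
    return (hanoi_moves(n - 1, src, aux, dst)
            + [(src, dst)]
            + hanoi_moves(n - 1, aux, dst, src))

for n in (3, 8, 12):
    print(n, len(hanoi_moves(n)))  # 3 -> 7, 8 -> 255, 12 -> 4095
```

This is the asymmetry the rebuttal highlights: a model that can write `hanoi_moves` has demonstrably solved the problem, even if it cannot emit all 2**n - 1 moves within its context window.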
***
## Performance Benchmarks and Analysis
| Task/Metric | Original Findings (Shojaee et al.) | Rebuttal Findings |
| ----------- | -------------------- | ----------------- |
| Tower of Hanoi | Collapse at N=8 | Collapse coincides with output token limits, not reasoning |
| River Crossing | Failure at N≥6 | Tasks are unsolvable; penalizing models is invalid |
| Reasoning Traces | Inconsistent scaling | Models can generate correct algorithms when format allows |
***
## Implementation Challenges
- Distinguishing between reasoning limitations and practical model constraints (token budget, output format).
- Designing evaluation protocols that verify task solvability and use complexity metrics reflecting computational difficulty, not just solution length.
- Ensuring that automated scoring frameworks do not misclassify model outputs due to rigid requirements.
***
## Key Technical Insights
- Evaluation Design: Careful experimental setup is critical to avoid conflating output constraints with reasoning ability.
- Model Awareness: LRMs can recognize and adapt to output limits, sometimes explicitly signaling when truncation occurs.
- Alternative Representations: Requesting compact algorithmic outputs can reveal reasoning capabilities hidden by exhaustive enumeration tasks.
- Complexity Considerations: True problem difficulty is a function of search and branching, not just the number of output steps.
***
## Future Directions
- Develop evaluation strategies that separate reasoning from output limitations.
- Incorporate multiple solution representations (e.g., code, high-level plans) to better assess model understanding.
- Verify puzzle solvability before benchmarking models on complex tasks.
Silicon Valley Generative AI has two meeting formats:
1. Paper Reading - Every second week we meet to discuss machine learning papers. This is a collaboration between Silicon Valley Generative AI and Boulder Data Science.
2. Talks - Once a month we meet to hear a presentation on a topic related to generative AI. Speakers range from industry leaders, researchers, startup founders, and subject matter experts to anyone with an interest in a topic they would like to share. Topics vary from technical to business-focused: how the latest generative models work and how they can be used, applications and adoption of generative AI, demos of projects and startup pitches, or legal and ethical topics. The talks are meant to be inclusive and aimed at a more general audience than the paper readings.

If you would like to be a speaker or suggest a paper, email us at svb.ai.paper.suggestions@gmail.com or join our new Discord!
Reinforcement Learning: Topic TBA
Typically covers chapter content from Sutton and Barto's RL book
As usual you can find below links to the textbook, previous chapter notes, slides, and recordings of some of the previous meetings.
Useful Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Recordings of Previous Meetings
Short RL Tutorials
My exercise solutions and chapter notes
Kickoff Slides which contain other links
Video lectures from a similar course