LL2 – How to Set Context in ChatGPT: Lessons from the Marshmallow Tower
I've always been fascinated by how small changes in structure and process can lead to dramatic shifts in what you can accomplish.
I remember feeling this way as a kid playing role-playing games and learning the optimal min-max build to make my character the strongest he could be.
I felt it when I studied Industrial Engineering in undergrad, learning how the basic principles of the Toyota Production System drove a small Japanese car manufacturer to take over the automotive world with a fraction of the resources.
I'm feeling it especially now with AI.
If you’re here, you probably feel the same way.
You're probably already using tools like ChatGPT or Claude for quick brainstorming or problem-solving. You've tried to use them for larger problems, but were discouraged when they generated vague, generic responses. Maybe they even hallucinated and made up complete nonsense.
You aren't sure if this is all just empty hype.
Today, I want to share a simple analogy for better AI tool use: the “marshmallow tower” team-building exercise.
It’s a game that highlights the power of iterating fast and layering in the right supports. I've found it to be the perfect analogy for how I set up and manage context when I use these AI-based systems.
Yes, it's a bit childish, but it might just level up the size of the problem you can throw at these LLMs.
The Marshmallow Tower: Why It Matters
In the classic challenge, you’re handed some spaghetti, tape, and a marshmallow and asked to build the tallest tower that can support the marshmallow on top. The thing that surprises most people is that kindergarteners often do better than adults.
Why is that?
Because they’re not afraid to fail quickly and iterate.
They put the marshmallow on top early, watch it collapse, and learn exactly where the structure needs reinforcement.
This same principle applies to AI interactions. Every piece of context you feed your AI models is like a beam or pillar that prevents your “tower” from collapsing under its own complexity. If you wait until the end to realize you’re missing a critical piece of context, your final deliverable might just topple at the last second.
Context: The Foundation of Your AI Builds
When you work on designing specialized AI agents (like a “Creative Brainstorming Agent” or a “Structured Planner”), you'll discover that the key to their success isn’t fancy prompts or secret hacks. It’s context.
If you only give your AI a fraction of the background it needs, it’ll struggle—even if it’s the most advanced model out there. But as you start to integrate more relevant details (like internal data, constraints, or specific project goals), your AI can suddenly stand taller. The difference is stark; without a strong base, your entire process wobbles.
Pro tip: You don’t need every piece of data in the universe. Provide just enough context to solve your immediate problem. If the tower is still wobbly, figure out which extra data or clarifications would reinforce it, then layer those in.
My “Accidental” Discovery of Context Mining
For a long time, I was unstructured in how I prompted ChatGPT. I was lucky to learn how important context was for ChatGPT as I developed an early preference for voice input. It's far easier to give full context when you don't need to type and can just talk to it like you would a person.
While I learned to give large amounts of context to AI very early, I wasn't diligent about retrieving context and starting over with it. I meandered and stretched chats far too long trying to iterate towards my goal. That sometimes led to partial success—but often, it was a mess of guesswork. I’d eventually realize I had crucial information stashed away in a Notion page or a conversation with a colleague that never made it into the AI’s context window.
That’s where I learned to do what I call “context mining.”
Instead of waiting for the tower to collapse, I’d proactively gather and organize key insights in one place. With each iteration, I’d test the tower, watch for weak spots, and summarize what I learned, like capturing how certain constraints or user stories changed the final output. If you're a gamer, it's like restarting a video game with the resources you picked up on your previous playthrough; the next run may go even further.
The result? Less repetitive trial-and-error and fewer topplings.
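If you wanted to make that habit concrete, a minimal sketch of context mining might look something like this. It assumes a hypothetical ask_llm() helper standing in for whatever chat tool you use; the prompts and function names are illustrative, not a recipe.

```python
# A minimal sketch of "context mining," assuming a hypothetical ask_llm()
# helper that stands in for whatever chat tool you use. The prompts and
# function names are illustrative, not a prescribed recipe.

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in: swap in your actual chat interface or API call."""
    raise NotImplementedError

def mine_context(chat_transcript: str) -> str:
    """Distill a long, meandering chat into a compact, reusable context brief."""
    return ask_llm(
        "Summarize the transcript below into a short context brief covering "
        "goals, constraints, decisions made, and open questions.\n\n"
        + chat_transcript
    )

def start_fresh_chat(mined_context: str, new_question: str) -> str:
    """Seed the next conversation with what the last one taught you."""
    return ask_llm(
        "Context from my previous exploration:\n"
        + mined_context
        + "\n\nWith that in mind: "
        + new_question
    )
```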
Sequential vs. Parallel Context Loading
You can think of two key pathways when it comes to loading context for your LLMs:
1) Sequential Loading (Exploring & Mining)
Think of this like short “research sprints.” You build a small structure with whatever context you have. You ask broad, exploratory questions and learn as you go. You collect bits and pieces from each exchange, eventually summarizing them so you can start a new chat seeded with the context you "mined" from the previous one. It’s less of an investment than trying to craft a meticulous prompt from the beginning when you aren't sure what you actually need. If you’re uncertain about how much detail you need, this method helps you learn fast without committing all your resources up front.
2) Full Load (Packing for the Long Haul)
Here, you grab as much context as you can—user insights, constraints, data, everything—and load it all before building. It’s efficient if you’re already clear on what matters, but it can send you in completely the wrong direction if the context you load is misaimed. I'll often do a little bit of both. One of my favorite things to do is dump context into ChatGPT while I'm walking to the subway. I can throw a massive amount of unstructured thoughts and ideas into it and turn that chaos into structured context to react to and use for further exploration.
It’s often better to test early, fail quickly, and pivot, like the kindergarteners with their marshmallows. Both approaches work (each is sketched below), but your choice should match the complexity of your project and your confidence in what you need to succeed.
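For those who think in pseudocode, here's a rough sketch contrasting the two loading styles. ask_llm() is the same hypothetical stand-in as above, and the prompt wording is just an assumption to show the shape of each approach.

```python
# A rough sketch contrasting the two loading styles. ask_llm() is the same
# hypothetical stand-in as before; the prompt wording only shows the shape
# of each approach.

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in: swap in your actual chat interface or API call."""
    raise NotImplementedError

def sequential_load(question: str, sources: list[str]) -> str:
    """Short research sprints: explore one source at a time, mining as you go."""
    mined = ""
    for source in sources:
        answer = ask_llm(
            f"Context mined so far:\n{mined}\n\nNew material:\n{source}\n\n{question}"
        )
        # Bank what this sprint taught you before starting the next one.
        mined = ask_llm(f"Condense this into a short context brief:\n{mined}\n{answer}")
    return ask_llm(f"Using everything mined so far:\n{mined}\n\nFinal pass: {question}")

def full_load(question: str, sources: list[str]) -> str:
    """Pack for the long haul: load everything up front in a single prompt."""
    return ask_llm("All relevant context:\n" + "\n\n".join(sources) + f"\n\n{question}")
```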
Iterative Refinement & Banking Your Insights
You won’t get a perfect tower in one try. That’s true whether you’re building a big marketing campaign or scoping out a product requirements document with significant downstream dependencies. You have to:
Prototype Quickly: Start with minimal context and see if the AI stands on its own.
Observe Gaps: Which aspects of the problem did it fail to address? Why?
Reinforce: Feed the missing context back in.
Build Higher: With a stronger base, tackle more complex questions.
Over time, you accumulate a “context bank” of proven, reliable details you can reuse. You learn processes for establishing, maintaining, and blending these distinct context banks for the problems you're trying to tackle.
This is how you eventually “train” your AI team to handle more intricate tasks without repeated oversights.
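If you want to keep it literal, here's one way a context bank could look as a simple file you grow over time. The file name, block keys, and blend() helper are assumptions for illustration, not a prescribed format.

```python
# One way a "context bank" could look if you kept it literal: a plain JSON file
# of proven, reusable context blocks you blend per problem. The file name,
# block keys, and blend() helper are assumptions, not a prescribed format.
import json
from pathlib import Path

BANK_PATH = Path("context_bank.json")

def save_block(name: str, text: str) -> None:
    """Bank a piece of context that earned its place through iteration."""
    bank = json.loads(BANK_PATH.read_text()) if BANK_PATH.exists() else {}
    bank[name] = text
    BANK_PATH.write_text(json.dumps(bank, indent=2))

def blend(*names: str) -> str:
    """Pull a few banked blocks together as the base for a new build."""
    bank = json.loads(BANK_PATH.read_text())
    return "\n\n".join(bank[name] for name in names)

# Example: save_block("product_constraints", "...") once, then later
# prompt = blend("product_constraints", "target_personas") + "\n\nDraft a PRD outline."
```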
Common Pitfalls & Success Patterns
Pitfall #1: Over-reliance on Guesswork
Don’t assume your AI “just knows.” If the tower collapses under the weight of new details, it’s because you left out foundational context, causing your LLM to deviate wildly from the direction you needed it to go. Verify you're on stable ground before trying to build higher.
Pitfall #2: Late Context Corrections
Cramming in context at the eleventh hour can be chaotic and might force you to rebuild more than you planned. Establishing a wide base of context early grounds your LLM; each subsequent query draws from that pool. Add critical context too late and it can drastically change how the model responds from that point on.
Pitfall #3: Poor Organization
If your data is scattered across different docs, tools, or Slack threads, you might miss the perfect supporting structure you already had. Getting the most out of AI means rethinking how you organize your documentation and how many calories you spend on updating and pruning your notes.
Success Patterns:
Start Small, Fail Fast: Quick, scrappy conversations with broad strokes "map the playing field" and identify how much context you'll need to establish for your goal.
Rigorous Validation: Each new piece of context should have a purpose—don’t add fluff.
Ongoing “Mining”: Keep refining your knowledge base so future builds are easier. Use conversational runs specifically to retrieve context.
Seeing the Big Picture
Here’s what I’ve come to realize: context is fractal. Each sub-problem is its own mini-tower, and each mini-tower needs its own stable foundation. The more you iterate and refine these foundational blocks, the easier it becomes to add layers of complexity without everything toppling over.
This gets easier over time.
You'll start to get a sense for how much context you need to provide to solve the problems in front of you. You may realize it's faster to break a larger problem into sub-problems, each solved with its own smaller "context tower." You can then feed those smaller towers back in as context for your LLM to solve the original problem.
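Sketched loosely, that decomposition might look like this, again leaning on the hypothetical ask_llm() stand-in; the prompts are illustrative only.

```python
# A hedged sketch of the "smaller towers" idea: solve sub-problems in their own
# small contexts, then feed those results back in as the foundation for the
# original problem. ask_llm() remains the hypothetical stand-in.

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in: swap in your actual chat interface or API call."""
    raise NotImplementedError

def solve_with_subtowers(problem: str, sub_problems: list[str]) -> str:
    sub_answers = []
    for sub in sub_problems:
        # Each sub-problem gets its own mini-tower: a small, focused context.
        sub_answers.append(ask_llm(f"Solve just this piece:\n{sub}"))
    combined = "\n\n".join(sub_answers)
    # The solved mini-towers become context for the original question.
    return ask_llm(f"Given these solved pieces:\n{combined}\n\nNow address: {problem}")
```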
And just like assembling an AI team with specialized roles, building a robust context structure doesn’t happen overnight. But if you commit to iterative improvement—failing fast, learning often—you’ll see your tower grow taller and stronger with every step.
Your Call to Action
The next time you turn to AI to dive into a difficult problem, pause and ask: do I have enough “spaghetti, tape, and context” to get a stable first build? If not, do some quick sprints to gather what’s missing.
Don’t be afraid of that initial wobble; it’s feedback that helps you figure out which details matter.
If you prefer loading everything at once, go for it - this usually yields the best results provided you have all your ducks in a row. Be ready to pivot if you discover you missed a critical piece along the way.
By intentionally mining for context and testing early, you’ll waste less time reworking core assumptions. You’ll see more consistent, higher-quality results from your AI, and you’ll feel more in control of the entire process. It’s a small mindset shift that can produce massive gains in productivity—and, more importantly, in the quality of what you build.
If you found this newsletter useful, go ahead and subscribe if you haven’t already!
I’ll keep sharing ways to refine your AI systems, design specialized agents, and uncover insights that help you work smarter—not harder.
Until next time,
Brandon Galang