Most "how I use AI" posts are useless. "I use it to brainstorm" and "it helps me think through problems" - okay, but what does that actually look like? What prompts? What failure modes have you hit? What do you not use it for?
Here's what I actually do, with enough specificity to be useful.
Understanding the tool before using it
Before getting into workflows, it's worth being honest about what ChatGPT is and isn't. It's a large language model - a stochastic next-token predictor trained on a massive corpus of text. The "temperature" setting controls output variance: lower temperature makes output more deterministic, higher makes it more varied but also more likely to drift. The context window is the amount of text the model can "see" at once - GPT-4-class models have context windows in the tens of thousands of tokens or more, depending on the variant, which matters when you're pasting in long documents.
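Temperature is easier to grasp with a toy model than a definition. This is not the real sampler, just a minimal sketch of the mechanic: temperature rescales the model's next-token scores before they're turned into a probability distribution, so low temperature sharpens the distribution and high temperature flattens it. The logits here are made-up numbers.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random.Random(0)):
    """Toy illustration of temperature: rescale logits, softmax, sample.
    Low temperature sharpens the distribution; high temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample one token index from the resulting distribution
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs

logits = [2.0, 1.0, 0.1]  # hypothetical next-token scores
_, cold = sample_with_temperature(logits, 0.2)
_, hot = sample_with_temperature(logits, 2.0)
print(round(cold[0], 3))  # at low temperature the top token dominates
print(round(hot[0], 3))   # at high temperature the distribution flattens
```

Run it and the first token's probability is near-certain at temperature 0.2 but barely above a coin flip at 2.0 - which is exactly the "predictable vs. drifts" tradeoff in practice.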
It doesn't have memory between sessions unless you use the memory feature or build it into your system prompt. It hallucinates - confidently produces false information - especially on specific facts, recent events, and anything that requires precise recall. It's good at synthesis, pattern recognition, and generating plausible-sounding text. It's bad at novel reasoning, precise arithmetic, and anything that requires ground truth it wasn't trained on.
Knowing this changes how you use it. You don't ask it to verify facts. You don't trust its specific numbers. You use it for tasks where "plausible and well-structured" is the goal, and you verify the parts that need to be accurate.
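The statelessness point is concrete once you see how chat APIs actually work: every request is independent, so "memory" just means the client resends the whole conversation each turn. The message format below matches the OpenAI-style chat API; `send_to_model` is a stand-in stub, not a real call.

```python
# Sketch of why "memory between sessions" doesn't exist by default: each
# request is stateless, so the client accumulates turns and resends them all.

def send_to_model(messages):
    # Stub standing in for an actual API call. The model only ever sees
    # what is inside `messages` for this single request.
    return f"(model saw {len(messages)} messages)"

history = [{"role": "system",
            "content": "You are helping a technical PM synthesize notes."}]

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = send_to_model(history)  # the full history is resent every turn
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Summarize these discovery notes: ...")
print(chat("Now flag the contradictions."))  # "remembers" turn one only
                                             # because turn one was resent
```

Drop the history and the model genuinely has no idea what you said a minute ago - which is why long sessions slowly eat your context window.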
The prompts I actually use
For processing messy notes from discovery sessions, I paste the raw notes and use a system prompt I've refined over time: "You are helping a technical product manager synthesize discovery notes. Identify the core user problems, separate stated problems from inferred ones, flag any contradictions, and note what's missing. Be skeptical - don't assume the user's stated solution is the right one." The "be skeptical" instruction matters. Without it, the model tends to validate whatever framing is in the notes.
For spec review, I paste the draft and ask: "What would an experienced backend engineer find ambiguous or underspecified in this? Focus on edge cases, error states, and anything that requires a decision the spec doesn't make." This is genuinely useful. It catches things like "what happens if the user closes the app mid-payment?" or "what's the timeout behavior on this API call?" - questions I should have answered but didn't.
For understanding technical concepts quickly, I use a specific framing: "Explain [concept] to someone who has written production code but hasn't worked with this specific technology. Focus on the tradeoffs and failure modes, not the happy path." This gets me past the tutorial-level explanation to the stuff that actually matters for product decisions. When I needed to understand how our message queue handles backpressure before writing a spec for a high-volume notification feature, this framing got me to a useful mental model in about fifteen minutes.
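Since I reuse these three framings constantly, it's worth keeping them as templates rather than retyping them. A minimal sketch of how I'd structure that - the prompt text is taken verbatim from above, but the function names and the idea of wrapping them this way are my own:

```python
# The three prompt framings above as reusable message builders, in the
# OpenAI-style chat message format. Helper names are hypothetical.

SYNTHESIS_SYSTEM = (
    "You are helping a technical product manager synthesize discovery notes. "
    "Identify the core user problems, separate stated problems from inferred "
    "ones, flag any contradictions, and note what's missing. Be skeptical - "
    "don't assume the user's stated solution is the right one."
)

def synthesis_messages(raw_notes):
    """Discovery-note synthesis: the refined system prompt rides along."""
    return [{"role": "system", "content": SYNTHESIS_SYSTEM},
            {"role": "user", "content": raw_notes}]

def spec_review_messages(draft):
    """Spec review: hunt for ambiguity, edge cases, unmade decisions."""
    prompt = ("What would an experienced backend engineer find ambiguous or "
              "underspecified in this? Focus on edge cases, error states, and "
              "anything that requires a decision the spec doesn't make.\n\n")
    return [{"role": "user", "content": prompt + draft}]

def explain_messages(concept):
    """Concept explanation pitched past the tutorial level."""
    prompt = (f"Explain {concept} to someone who has written production code "
              "but hasn't worked with this specific technology. Focus on the "
              "tradeoffs and failure modes, not the happy path.")
    return [{"role": "user", "content": prompt}]

msgs = synthesis_messages("Raw notes from today's call...")
print(msgs[0]["role"])  # the system prompt is attached on every request
```

The payoff is consistency: the "be skeptical" instruction is baked in, so I can't forget it on a day when I'm rushing.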
Where it fails and why
The context window limitation bites you in specific ways. If you paste a long document and ask questions about it, the model's recall isn't uniform across the prompt: research on long-context models (the "lost in the middle" effect) finds they attend best to the beginning and end of the input, while details buried in the middle get the least weight. I've learned to put the most important constraints at the top of whatever I'm pasting.
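That habit - constraints first, question last - can be made mechanical. A minimal sketch, with a layout and function name of my own invention:

```python
def assemble_context(constraints, document, question):
    """Put must-not-miss constraints first, the long document in the middle,
    and the question last - the positions a long-context model tends to
    weight most are the start and the end of the prompt."""
    parts = ["IMPORTANT CONSTRAINTS:"]
    parts += [f"- {c}" for c in constraints]
    parts += ["", "DOCUMENT:", document, "", "QUESTION:", question]
    return "\n".join(parts)

prompt = assemble_context(
    ["Payments must be idempotent", "Timeouts retry at most twice"],
    "...long spec text...",
    "What edge cases are unhandled?",
)
print(prompt.splitlines()[0])  # constraints lead the prompt
```

It's trivial, but it encodes the lesson so the important stuff never ends up stranded in the middle of a 10,000-word paste.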
RAG - Retrieval Augmented Generation - exists because of this problem. Instead of stuffing everything into the context window, you retrieve only the relevant chunks. I don't build RAG pipelines for my daily work, but understanding why RAG exists helps me understand why ChatGPT gives worse answers when I paste in too much context.
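The core RAG move fits in a few lines. Real pipelines use embedding models for similarity; this sketch substitutes bag-of-words cosine similarity so it runs standalone, and the sample chunks are made up - but the shape is the same: score chunks against the query, send only the winners.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, k=1):
    """The core RAG move: instead of stuffing every chunk into the context
    window, rank chunks by similarity to the query and keep only the top k."""
    qv = Counter(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: cosine(Counter(c.lower().split()), qv),
                    reverse=True)
    return scored[:k]

chunks = [
    "The queue applies backpressure when consumers lag behind producers.",
    "Billing runs nightly and retries failed invoices twice.",
    "Notifications are batched per user to avoid duplicate sends.",
]
print(retrieve(chunks, "how does the queue handle backpressure")[0])
```

A production system swaps the word-overlap scoring for embeddings and a vector index, but the principle - retrieve the relevant piece, not everything - is what explains the "too much context" failure mode.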
The other failure mode is what I call "confident synthesis of wrong premises." If my notes contain a wrong assumption, the model will synthesize it into a coherent-sounding output that's built on that wrong assumption. It doesn't push back. It doesn't say "wait, that doesn't make sense." It just produces well-structured text that inherits the error. This is why I always read the output critically, not as a finished product.
What I don't use it for
I don't use it to make prioritization decisions. "Should we build feature X or feature Y?" is not a question it can answer. It doesn't know our user behavior data, our engineering capacity, our competitive situation, or the hundred small things that inform that call. It will give me a balanced answer that considers both sides, which is exactly useless.
I don't use it to generate product ideas. The ideas it generates are statistically average - they're what a PM blog post would suggest, not what our specific users actually need. User research generates better ideas than any LLM, because user research is grounded in specific people with specific problems.
I don't use it to write specs from scratch. The spec needs to reflect my understanding of the problem. If I haven't done the thinking, the spec will be wrong regardless of how well-formatted it is. AI-generated specs look like specs. They have the right sections and the right language. They're also wrong in ways that aren't obvious until an engineer spends a week building the wrong thing.
The honest accounting
ChatGPT saves me maybe 45 minutes a day on synthesis and editing tasks. That's real. It doesn't make me a better product thinker - that comes from talking to users through support tickets, working with my engineering team, and making decisions and seeing what happens.
The risk is using it as a substitute for thinking rather than a tool for thinking. Reasonable-looking output built on shallow thinking is worse than rough output built on real understanding, because it's harder to spot the problem. Use it for the work that benefits from speed. Do the thinking yourself.