My exploratory AI workflow

I’ve been spending time doing some work with AI, and this post is a “check-in” journal on what has been working for me.

This post will begin with “Idea Honing” (a mini PRD), then move to an iterative development workflow, and close with an example project.

Idea Honing

You need to spend some time honing your idea. Harper’s blog post on this subject contains a great prompt to help you step through it:

Ask me one question at a time so we can develop a thorough, step-by-step spec for this idea. Each question should build on my previous answers, and our end goal is to have a detailed specification I can hand off to a developer. Let’s do this iteratively and dig into every relevant detail. Remember, only one question at a time.

Here’s the idea:

<IDEA>

You should use the best LLM available to you for this, as you want clarity of words & thought. As of this writing (Sept 1st, 2025), GPT-5 and Gemini 2.5 Pro work great for this. I also maintain a leaderboard for wordcel evals – any of the top models should work well.

One tweak I have added to Harper’s blog is in the last step, where he suggests wrapping up: make sure to add “Give me the final output as XML” – for whatever reason, this works great!

Development Environment

Lately I have been using vanilla VS Code + Amp. It’s pretty simple to install the plugin and get started – but this is a paid tool! I find I spend somewhere around $5/hr when I use it, which seems fine in the context of my other hobbies.

What’s great about having an environment this simple is that it works on Windows, Mac, & Linux, so I can seamlessly switch between them based on how I work.

Once you have the environment set up (it should take you five minutes or less), you can get started with the prompt. Create a new folder for your project, open VS Code, and pass the prompt from the previous step into Amp.

just vs code and amp!

Other Helpful Environment Add-ons

There are a ton of niche tools you can add to your environment, so I am simply going to call out the ones I use regularly.

  1. DuckDB for data manipulation.
  2. uv for Python environment management.
  3. Node + npm for JavaScript environment management.

These three tools (frameworks? libraries?) will get you very far.

Interacting with your LLM while coding

If you have a simple idea and a good prompt, most of the time your LLM can “one-shot” it, getting you a working prototype from a single prompt. Your goal when building this stuff should always be to decompose your problem into individual, runnable steps, like the Agile car analogy of yore. This is particularly important with LLMs because every time you can run the program, you can also test it. Keep these steps small so you can commit your changes to source control as soon as your current set of tests passes; then you can give the LLM a new prompt and continue forward from a stable base.

Once I get the first runnable prototype, I make my first commit and spin up the repo on GitHub. Make sure to do this! LLMs are not deterministic and can destroy your project at any time, as has been memed numerous times on Twitter (and, a few weeks later, LinkedIn).

Testing

There are a couple of approaches to testing that seem to work well with LLMs. I will outline two.

Option 1: Just use the software

This is my typical approach. I just use the thing, modify it as I see fit, and fix bugs as I go. Only once it works the way I want do I add tests – basically to make sure that future changes do not break existing functionality. Of course, LLMs are great at adding tests!
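To make that concrete, here’s a minimal sketch of the kind of after-the-fact regression test I mean, using Node’s built-in test runner. The csvEscape helper is a hypothetical stand-in for whatever function your project actually exposes:

import { test } from "node:test";
import assert from "node:assert/strict";

// Hypothetical helper: quote a CSV field only when it needs it.
function csvEscape(field: string): string {
  return /[",\n]/.test(field) ? `"${field.replace(/"/g, '""')}"` : field;
}

// Lock in behavior I already verified by hand, so future LLM edits
// can't silently break it.
test("fields containing quotes get escaped per RFC 4180", () => {
  assert.equal(csvEscape('he said "hi"'), '"he said ""hi"""');
});

test("plain fields pass through untouched", () => {
  assert.equal(csvEscape("plain"), "plain");
});

A green node --test run (after compiling, or via a TypeScript runner) is my signal to commit and hand the LLM its next prompt.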

Option 2: Test Driven Development (TDD)

For more complex codebases, TDD seems to be the way to go. I find it overkill for a hobby project, but you can indeed write all your tests first and then let the LLM “solve” the tests. For complex logic puzzles (e.g., NFL tiebreaker logic), TDD is a much better approach than brute force – using the software and fixing bugs as they come up.
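As a sketch of what that looks like in practice: you write the spec as a failing test first and let the LLM implement until it passes. Here headToHeadWinner is a made-up stand-in for real tiebreaker logic, and the game data is invented for illustration:

import { test } from "node:test";
import assert from "node:assert/strict";

type Game = { home: string; away: string; homePts: number; awayPts: number };

// Deliberately unimplemented -- in TDD the test fails first, and the
// LLM's job is to make it pass.
function headToHeadWinner(a: string, b: string, games: Game[]): string {
  throw new Error("not implemented");
}

test("a head-to-head sweep breaks the tie", () => {
  const games: Game[] = [
    { home: "MIN", away: "GB", homePts: 24, awayPts: 10 },
    { home: "GB", away: "MIN", homePts: 17, awayPts: 20 },
  ];
  assert.equal(headToHeadWinner("MIN", "GB", games), "MIN");
});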

A practical example

I did a small project over Labor Day weekend: a Chrome extension that copies a screenshot and turns it into a CSV. The part below is very long because it contains the full idea-honing transcript and prompt.

Initial Idea Honing Transcript

User: Ask me one question at a time so we can develop a thorough, step-by-step spec for this idea. Each question should build on my previous answers, and our end goal is to have a detailed specification I can hand off to a developer. Let’s do this iteratively and dig into every relevant detail. Remember, only one question at a time.

Here’s the idea: a chrome extension called “csv everything” that lets you take a screenshot and turn it into a csv, which is downloaded by your browser. behind the scenes it makes an API call (using your own API key) to open router with the image, then returns the result as csv. If it cannot identify that it’s a table, it should tell the user “couldn’t translate to csv”. It should also work with charts – basically it will give you the data points that allow you to build a line / bar / scatter chart.

Gemini: Of course. This is a great way to build a detailed spec. Let’s start with the user’s first action.

After the user installs the extension, how do they initiate the screen capture? For example, do they click the extension icon in the toolbar, which then allows them to drag and select an area on the page?

User: initially it takes whatever is in the clipboard, verifies if it is an image, and if so, uses it. If it’s not an image, it tells the user “copy an image of the data first”

Gemini: Got it. The trigger is clicking the extension icon, which then checks the clipboard.

Now, let’s think about the necessary credentials. The extension needs the user’s OpenRouter API key to make the API call.

Where and when does the user input and save their API key? For instance, is there a settings page, or does a field pop up the very first time they use the extension?

User: in the extension settings

Gemini: Excellent, a standard settings page is a clear approach.

Let’s refine the user flow for someone who is just getting started. A new user installs the extension, copies an image of a table, and clicks the extension icon, but they haven’t entered their API key in the settings yet.

What should the extension do at this point? For example, should it show a message like “API key not found, please add it in settings” with a button that takes them there?

User: it shows a message like “API key not found, please add it in settings” with a button that takes them there

Gemini: Perfect, that’s a very user-friendly flow.

Now, let’s move to the core action. The extension has the image from the clipboard and the user’s API key. It’s ready to make the call to the OpenRouter API.

What specific instructions (i.e., the prompt) should the extension send along with the image? For example, should it be a simple instruction like, “Extract the data from this image into CSV format, including headers if present”?

User: we can hone the prompt later as part of the testing. but yes, for now let’s use a placeholder: “turn the image into a csv”. maybe the user can tune the prompt in their settings?

Gemini: Excellent idea. Making the prompt customizable in the settings is a powerful feature for advanced users. We’ll add that to the spec for the settings page.

So, the API call has been sent with the image and the user-defined prompt. This process might take a few seconds.

What should the user see in the extension’s interface while it is actively processing the image and waiting for a response from the API? For example, should it display a loading animation or a message like “Generating your CSV…”?

User: it should have a little spinner like the download icon in chrome – a dotted line spins around the outside of the icon.

Gemini: Got it, a spinner animation that mimics the native Chrome download icon will provide clear visual feedback.

Now, let’s handle the two possible outcomes of the API call. We’ll start with the successful one.

The spinner stops, and the extension has received valid CSV data from the API. What should happen next? Should it automatically trigger the browser’s download functionality for the `.csv` file? And what should the default filename be?

User: “data.csv” is the default name; it should trigger the “save file” popover

Gemini: Understood. The extension will trigger the standard “Save As” browser dialog, pre-filling the filename as `data.csv`.

Now, let’s cover the other scenario. The API call finishes, but the model could not interpret the image as a table or chart.

What exactly should the extension show the user in this case? For example, does the popup window display the text “couldn’t translate to csv” with a “Try Again” button?

User: it needs to identify error states – if there is an issue with the API call (anything other than a 200) it should tell the user the specific error. if the image failed to convert but the API call succeeded, it should say “could not convert this image to a csv” – I think the challenge here is that “try again” implies that the image could be converted to a csv, but in many cases it probably cannot be, because users are pasting in bad images.

Gemini: That’s an excellent distinction between an API error and a conversion failure. We will add that specific logic to the spec.

So, when the popup displays either an API error (like a 401 or 500) or the “Could not convert this image to a csv” message, how does the user dismiss it to end the current attempt? Should there be a simple “OK” button, or can they just click away from the popup to close it?

User: Yeah, just hit an “X” in the corner. “OK” doesn’t do anything for the user in this case, so it’s more of an FYI

Gemini: Perfect, an “X” to close notifications is clean and simple.

Now, let’s circle back to a key feature you mentioned in the original idea: processing charts. So far, the logic we’ve built is perfect for images of tables.

Does this same process and prompt—”turn the image into a csv”—also apply to images of line charts, bar charts, or scatter plots? Or does the extension need to handle charts differently in some way?

User: yes, the prompt should be the same. if the user wants a more specific prompt, they can tune it in the settings

Gemini: Excellent, that simplifies the core logic. The extension will use the same process for all images and rely on the model’s intelligence and the user’s custom prompt for specific needs.

We have now defined the entire end-to-end user flow. Let’s refine one last detail on the settings.

OpenRouter provides access to many different AI models (like GPT-4o, Claude 3, etc.), which have varying costs and performance. Should the extension use a single, hard-coded model for all users, or should there be a dropdown menu in the settings page allowing the user to choose which model they want to use?

User: it should be set by default to gemini 2.5 pro, but allow the user to pass a model string from open router if they want.

Gemini: Got it. We’ll set a sensible default model but give the user full control to override it. That’s a perfect balance.

I believe we have now covered the entire specification from the first click to the final output, including settings and error handling. I will now compile everything into a final summary that you can hand off.

Editor’s Note: I removed the plain-text spec for brevity, as it’s included in the XML below.

This completes the specification. Does this detailed summary accurately reflect your vision for the extension?

User: give me this spec as xml please

Gemini: Of course. Here is the complete specification formatted as XML.

<?xml version="1.0" encoding="UTF-8"?>
<specification project="csv everything">
    <description>A Chrome Extension to convert images of tables or charts into a downloadable CSV file using the OpenRouter API.</description>

    <coreFunctionality>
        <trigger>User clicks the extension icon in the Chrome toolbar.</trigger>
        <inputSource>
            <source>Image from the user's clipboard.</source>
            <logic>
                <case condition="Image found">The process begins.</case>
                <case condition="Image not found">An error is shown (see Error Handling).</case>
            </logic>
        </inputSource>
        <processing>
            <api>OpenRouter</api>
            <details>The extension sends the clipboard image and a prompt to the API using the user's configured credentials.</details>
        </processing>
        <successfulOutput>
            <action>Triggers the browser's native "Save As" dialog.</action>
            <defaultFilename>data.csv</defaultFilename>
        </successfulOutput>
    </coreFunctionality>

    <userInterface>
        <state type="in-progress">
            <indicator>A loading spinner that visually mimics Chrome's native download icon.</indicator>
            <description>Displayed in the extension popup while waiting for the API response.</description>
        </state>
        <state type="notification">
            <dismissal>User clicks an "X" icon in the corner of the message box.</dismissal>
            <description>Applies to all error and informational messages.</description>
        </state>
    </userInterface>

    <settingsPage>
        <description>A standard options page for user configuration.</description>
        <fields>
            <field name="OpenRouter API Key">
                <type>Text Input</type>
                <required>true</required>
                <purpose>To securely store the user's API key.</purpose>
            </field>
            <field name="Model String">
                <type>Text Input</type>
                <required>false</required>
                <purpose>Allows the user to specify which OpenRouter model to use.</purpose>
                <defaultValue>google/gemini-2.5-pro</defaultValue>
            </field>
            <field name="Custom Prompt">
                <type>Text Area</type>
                <required>false</required>
                <purpose>Allows the user to define the prompt sent to the model. Used for all image types (tables, charts, etc.).</purpose>
                <defaultValue>turn the image into a csv</defaultValue>
            </field>
        </fields>
    </settingsPage>

    <errorHandling>
        <case condition="No Image in Clipboard">
            <message>Copy an image of the data first</message>
        </case>
        <case condition="Missing API Key">
            <message>API key not found, please add it in settings</message>
            <action>Provide a button that directs the user to the settings page.</action>
        </case>
        <case condition="API Call Failure">
            <trigger>API responds with a non-200 status code (e.g., 401, 500).</trigger>
            <message>Display the specific error returned by the API.</message>
        </case>
        <case condition="Conversion Failure">
            <trigger>API responds with a 200 status code, but the model could not extract data.</trigger>
            <message>Could not convert this image to a csv</message>
        </case>
    </errorHandling>

</specification>

From here I took that spec into an empty directory with VS Code and Amp, and they took it from there. I did end up modifying it so you could run it in the background, but otherwise the initial spec was bang on. You can take a look at the repo here: CSV Everything.
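To give a feel for what that spec boils down to in code, here is a minimal sketch of the core conversion flow: clipboard image in, OpenRouter call, CSV download out. This is my illustration rather than the repo’s actual code; the request shape follows OpenRouter’s OpenAI-compatible chat completions API, and the Chrome pieces are standard Manifest V3 APIs. Note the data: URL for the download – URL.createObjectURL isn’t available in MV3 service workers, which is exactly the error that surfaces later in my Amp prompts.

// background.ts -- illustrative sketch, not the actual CSV Everything code.
async function convertImageToCsv(imageDataUrl: string): Promise<void> {
  // Settings come from the options page, per the spec.
  const { apiKey, model, prompt } = await chrome.storage.sync.get({
    apiKey: "",
    model: "google/gemini-2.5-pro",
    prompt: "turn the image into a csv",
  });
  if (!apiKey) throw new Error("API key not found, please add it in settings");

  // OpenRouter is OpenAI-compatible: send the image as an image_url part.
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [{
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      }],
    }),
  });
  if (!res.ok) throw new Error(`OpenRouter error ${res.status}`); // surface non-200s

  const csv: string = (await res.json()).choices[0].message.content;

  // MV3 service workers have no URL.createObjectURL, so use a data: URL.
  await chrome.downloads.download({
    url: "data:text/csv;charset=utf-8," + encodeURIComponent(csv),
    filename: "data.csv",
    saveAs: true, // triggers the "Save As" dialog, per the spec
  });
}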

Interactions with AmpCode

The interactions with Amp once I had the spec were fairly trivial. Here are my iterative prompts with Amp, after the initial spec. My specific prompts are always in quotes and my commentary is unquoted; as such, you will see some typos in my prompts.

  • “Next we need to build and test. how do i package it so i can load it in my chrome for testing?” (Once I had this answer, I immediately began testing locally, and the questions below all follow that line of thought)
  • I noticed it was building an icon in png, so I interrupted and said “Lets use SVG just so we can test”
  • “change the icon to be the text “CSV””
  • “in my testing, the response from gemini comes in markdown ```csv <text> ``` the markdown formatting shouldn’t be passed to the csv file that is created, so please strip that away. Also, the icon isn’t loading. I think we do need to render the png.” (A sketch of this fence stripping follows this list.)
  • I noticed it was using the system python to generate an icon so I stopped it and said “use uv instead”
  • “lets change the Icon to be bold, black text on a transparent background.”
  • “Ok, so when I click off the extension or change tabs, the api call is interuptted. Is there a way to make it run in the background once the conversion is started?”
  • “when its running, is there any way in indicate the extension icon is doing something to the user? like a little blue dot or something” (a badge-based version of this is also in the sketch after this list)
  • “I tried to use Gemini Flash, and it failed. Is that because 2.5 pro is a reasoning model and Flash is not?”
  • “Hmm, I’m not getting good enough error messages. Pro works, but flash doesn’t. I can see the API calls making their way to open router, but the response isn’t coming showing up. We should log the entire response from openrouter when we are in “debug mode”, which is a flag in the settings (enable debug mode : true/false), even if it is invalid.”
  • “seeing this error: Background conversion error: TypeError: URL.createObjectURL is not a function” (note: I also included a screenshot of the error)
  • “ok the indicator for when its running is way too big. It should a small blue dot in the top right of the icon area. 1/4 the size of current indicator.”
  • “change the text to a “down arrow””
  • “Ok so i am working on publishing to Chrome, and it is asking me why i need the “Host Permission” – can we rework this to work without that permission? If not, why?”
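Two of those prompts – stripping the markdown fences and the little busy indicator – are easier to show than to describe, so here is a rough hand-written sketch of both. This is hypothetical code, not what Amp actually produced:

// Strip a markdown code fence from the model's reply, e.g.
// "```csv\na,b\n1,2\n```" becomes "a,b\n1,2".
function stripMarkdownFences(reply: string): string {
  return reply
    .replace(/^\s*```[\w-]*\s*\n?/, "") // opening fence, e.g. ```csv
    .replace(/\n?```\s*$/, "")          // closing fence
    .trim();
}

// A toolbar badge is the closest MV3 primitive to "a little blue dot".
function setBusyIndicator(busy: boolean): void {
  chrome.action.setBadgeBackgroundColor({ color: "#1a73e8" });
  chrome.action.setBadgeText({ text: busy ? " " : "" });
}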

And here are a couple of demos!

Conclusion

As you can see, it is fairly simple to get started with these tools, but there is a lot of depth in how you use them. I hope this is a helpful glance into the current state of how I am using them, and that you find this type of journaling useful.

Exploring AI-Enhanced Development: My Experience with Codeium’s Windsurf IDE

AI-powered tools are transforming the way we code, and I recently got a chance to dive into this revolution with Codeium’s Windsurf IDE. My journey spanned two exciting projects: updating the theme of my mdsinabox.com project and building a Terraform provider for MotherDuck. Each project offered unique insights into the capabilities and limitations of AI-enhanced development. It should be noted that I did pay for the “Pro Plan” as you get rate limited really quickly on the free tier.

Project 1: Updating the Theme on mdsinabox.com

My first project involved updating the theme of my evidence.dev project. Evidence.dev is a Svelte-based app that integrates DuckDB and charting (via ECharts). Styling it involves navigating between CSS, Svelte, TypeScript, and SQL—a perfect storm of complexity that seemed tailor-made for Windsurf’s AI workflows.

I aimed to update the theme fonts to use serif fonts for certain elements and sans-serif fonts for others. Initially, I asked the editor to update these fonts, but it failed to detect that the font settings were managed through Tailwind CSS—a fact I didn’t know either at the time. We wasted considerable time searching for where to set the fonts.

the windsurf editor using cascade (right pane) to update the code

After a frustrating period of trial and error – poring over project internals and, later, reading documentation – I realized that Tailwind CSS controlled the fonts. Once I instructed the editor about Tailwind, it identified the necessary changes immediately, and we were back on track.
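For reference, the fix ended up being roughly this kind of edit. This is a minimal sketch assuming a standard Tailwind config; the actual Evidence setup, content globs, and font choices differ:

// tailwind.config.ts -- sketch of pointing theme fonts at serif/sans stacks.
// Font names and paths here are illustrative, not the real project values.
import type { Config } from "tailwindcss";

export default {
  content: ["./src/**/*.{svelte,ts,html}"],
  theme: {
    extend: {
      fontFamily: {
        // anything styled with `font-serif` / `font-sans` picks these up
        serif: ["Georgia", "serif"],
        sans: ["Inter", "system-ui", "sans-serif"],
      },
    },
  },
} satisfies Config;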

updated theme on the nba team pages

However, one gripe remained: Windsurf’s model didn’t include the build files for the Evidence static site, so I had to manually copy files to another directory for it to work. Additionally, debugging errors using the browser’s source view proved more efficient than relying on the editor. These limitations were a bit frustrating, but the experience highlighted the importance of understanding your project’s architecture and guiding AI tools appropriately. Access to a browser emulator would massively improve the debugging experience.

Project 2: Building a Terraform Provider for MotherDuck

The second project was sparked by a potential customer’s request for a Terraform provider for MotherDuck. While I was familiar with Terraform conceptually, I’d never used it before. With the recent launch of our REST API at MotherDuck, this felt like the perfect opportunity to explore its capabilities.

I instructed Windsurf, “I want to make a Terraform provider. Use the API docs at this URL to create it.” The editor sprang into action, setting up the environment and framing the provider. While its initial implementation of the REST API was overly generic and didn’t work, the tool’s ability to see the entire codebase end-to-end made it relatively straightforward to refine. I did have to interject and say “here is an example curl request that I know works, make it work like this” which was enough to get it unstuck.

intervening with cascade to tell it to change directory instead of run go init (again)

As an aside, observing it at times was quite comical, as it seemed to take approaches that were clearly incorrect, especially when I was dealing with some invalid authorization tokens. It would almost say, “well, I trust that my handler has given me a valid token, so it must be something else,” and just start doing things that were obviously not going to work.

Anyway, once the main Terraform file was built, I tasked the editor with writing tests to validate its functionality. It recommended Go, a language I had no prior experience with, and even set up the environment for it. Through a mix of trial and error and manual intervention (particularly to address SQL syntax issues like the invalid ‘attach if not exists’ statement in MotherDuck), I managed to get everything working. From start to finish, including testing, the entire process took around four hours—which seemed pretty decent given my experience level.

Conclusion

My experience with Codeium’s Windsurf IDE revealed both the promise and the current limitations of AI-enhanced development. The ability to seamlessly navigate between languages and frameworks, quickly scaffold projects, and even tackle unfamiliar domains like Go was incredibly empowering. However, there were moments of friction: misunderstandings about project architecture, limitations in accessing build files, and occasional struggles with syntax. Getting these models into the right context quickly is pretty difficult with projects that have lots of dependencies, even though my projects are fairly low complexity.

Still, it’s remarkable how far we’ve come. AI-enabled editors like Windsurf are not just tools but collaborative partners, accelerating development and enabling us to take on challenges that might have otherwise seemed impossible. As these technologies continue to mature, I can’t wait to see how I can use them to build even more fun projects.