Ben discusses vibe coding. Is AI all smoke and mirrors?

I’ve been interested in technology my whole life. From the moment my Dad brought home a C64 in 1983, I was hooked. Mostly, though, I’ve been a user of technology. I dabbled in coding at university in the early 90s, but not enough to produce anything that worked. That was always a regret, because throughout my career, first in finance and then as a business psychologist, I’ve had many ideas for solutions that required coding skills.

Those ideas have stayed as ideas, and I’ve moved on. That is, until now.

In April, I met up with an old friend, who just happens to be an experienced programmer, and he mentioned this “thing” called “vibe coding” and a platform called Replit.

The idea takes root

I looked it up, and the sales pitch is this: just type a description of what you want to build into the prompt box, whether an app, a game, or a website, and Replit will build it for you. You supposedly need zero knowledge or experience; you just tell it what you want and away it goes!

After some brief research I found that there were various vibe coding platforms to choose from, with different levels of complexity. I settled on the Google platform, Firebase Studio (powered by Google’s AI model, Gemini), mostly because the backend elements, such as authentication, storage, and a database, are provided seamlessly. As a complete novice, I found that having everything in one place gave me a lot of confidence (in truth, I don’t think it’s that hard to sort out the backend with other platforms).
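To give a flavour of what “everything in one place” means, here’s a minimal sketch of how a Firebase web app wires up those three backend pieces. This isn’t code Firebase Studio generated for me, just an illustration, and the config values are placeholders.

    // Minimal sketch: wiring up Firebase's backend services in a web app.
    // The config values below are placeholders, not a real project.
    import { initializeApp } from "firebase/app";
    import { getAuth } from "firebase/auth";
    import { getFirestore } from "firebase/firestore";
    import { getStorage } from "firebase/storage";

    const app = initializeApp({
      apiKey: "YOUR_API_KEY",
      authDomain: "your-project.firebaseapp.com",
      projectId: "your-project",
      storageBucket: "your-project.appspot.com",
    });

    export const auth = getAuth(app);       // authentication
    export const db = getFirestore(app);    // database (Firestore)
    export const storage = getStorage(app); // file storage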

It can’t be that hard…

My first idea involved building a platform to host our Situational Judgement Tests. We’ve developed a variety of assessments specifically for the waste and recycling sector, and I was curious to see what was possible.

For my first attempt, I typed away, describing what I wanted, and pressed enter. Firebase Studio jumped into action, considered what I’d asked for, and told me what it was going to do, including the look and the colours. Then I simply needed to say yes, crack on, and whoosh, off it went, lines and lines of code being produced in a matter of seconds. It was like watching the computer screens in The Matrix, with lines of code simply appearing! It was fascinating and exciting to see!

Dreams and imaginations do not Apps make

Firebase Studio finished and posted the results for me to see. At first you are in the prototype stage, where the initial app is produced and you can play about with it locally, making changes, adding things, and removing things, before deciding to publish it as a real, working web page.

The initial result was both exciting and disappointing. It sort of looked like I wanted it to look, and some aspects worked. I could add a new assessment, which took me through to another page where it could be constructed, but that page had no functionality; the app didn’t work. I also didn’t like the structure of the app and realised that there were so many things I hadn’t thought about.

In fact, I’d basically jumped in and played about a bit, which is fine. I think “free-swimming” with these tools is a great way to learn how to use them, but if I wanted more, I needed a different approach.

More structure, more thought

I realised that if I was going to build an actual platform to host my assessments, then I needed to think like a developer. I needed to be structured and considered in my approach. So I mapped it out in a Word document, outlining in detail the functionality of every aspect of the app: how it linked together, what I wanted to be able to do, and why.

Having finished, I copied the whole document and pasted it into ChatGPT. I told it what I was doing and asked if it could improve the structure of the prompt for Firebase Studio. Amazingly, it made some great improvements, and I pasted the revised prompt straight into Firebase Studio and sat back as it whooshed into action again.

Version 2 results

This time the results were so much better. The app structure and functionality reflected what I’d described, and it was so much easier to give Firebase Studio feedback on what to change and what to add. Having the history of the conversation means that Firebase Studio understands the context and is better able to respond effectively.

So, job done then: a perfectly working platform to host our SJTs? Well, no. If only things were that straightforward.

Having published the app, I wanted to start using it, and it wasn’t working. I kept receiving errors when I tried to add a new assessment. It turned out there was no “backend” set up, and Firebase Studio hadn’t told me that I needed to do this. Anyone with knowledge of app development would know this needs doing; I had zero knowledge. I was literally learning on the go.
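To make that concrete, here’s roughly the kind of operation that was failing. This is my own illustrative sketch rather than the generated code, and the collection name and fields are made up; the point is that if the database behind the app hasn’t been provisioned, a write like this gets rejected, and the error message tells you far more than “it doesn’t work”.

    // Illustrative only: adding a new assessment document to Firestore.
    // If the backend (the Firestore database and its security rules) hasn't
    // been set up, this write fails, and the caught error explains why.
    import { addDoc, collection } from "firebase/firestore";
    import { db } from "./firebase"; // the initialised Firestore instance

    async function addAssessment(title: string) {
      try {
        const ref = await addDoc(collection(db, "assessments"), {
          title,
          createdAt: Date.now(),
        });
        console.log("Created assessment", ref.id);
      } catch (err) {
        // Sharing the actual error is far more useful to Gemini (or a human)
        // than a vague description of the symptom.
        console.error("Failed to add assessment:", err);
      }
    }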

Errors – rinse and repeat

Here’s what happened, and, for me, it reflects the limitations of current AI agents. I told Gemini the error I was facing, and it instantly jumped into action: “I know what the issue is, this is the definitive fix, you just need to republish and it’ll be fixed.” Except it wasn’t, and we repeated this process numerous times before I asked whether there was anything else that needed to be done, such as the backend. “Oh yes, of course,” Gemini said, “you need to set up this, that and the other…”

We appeared to have success and the platform was working, which was amazing, but then bugs started to appear. I didn’t understand them, and Gemini would jump to conclusions about the cause of each bug, tell me that this would be the ultimate fix, and then offer a grovelling apology every time the fix came crashing down and the bug remained.

The current limitations of AI

One of my biggest takeaways from trying to develop the app is that there are clear limitations with Gemini, and probably with other platforms too.

Gemini makes errors in the code, simple errors, which it often spots later, but you can go round in circles as it spots one error and then makes another. It also doesn’t question or think systematically, so your prompts need to be effective.

Learning to write effective prompts is a major driver of success, but it’s currently not enough. Gemini’s inability to step back from a problem, even when I told it to, was a consistent limitation. It is so eager to help, and with the backend functionality it provided wonderful step-by-step instructions that I would have been lost without, but it jumps to conclusions too quickly.

It’s as if there might be ten reasons for a bug: it goes to the first possible reason and tells you this is the cause and this will fix the problem. Then, when it doesn’t fix the problem, it jumps straight to the next possible cause, and so on, until it either stumbles upon the solution or runs out of options.

The interactions became tedious, and I turned to ChatGPT for help, copying and pasting conversations and code. This led to silly errors being spotted and to prompts that moved things forward, but it was only helpful up to a point.

It’s all in the logs

I had one bug that Gemini simply couldn’t fix, even with ChatGPT involved, and in the end it told me, “I can’t help you, you need to ask a human programming engineer for assistance.”

Are you kidding me? I thought. Rather than give up, I asked what other information I could share that might help to fix the bugs, and whether there were other places I could get more information.

Immediately, the answer came back, “Yes, supply me with the log reports.”

I couldn’t quite believe what I was reading: two or three weeks into the development of the app, it was suddenly telling me about the Google Cloud console. This is where the really complicated stuff happens, and one of its features is log reports. Every time you do anything in the app, that action, and the internal actions of the app and the backend, is recorded. When an error happens, it is highlighted in red, and you can open the log report and copy all the details. Gemini doesn’t have access to the console, so I started copying the logs and sharing them to help Gemini understand what was happening.
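For anyone who would rather not copy logs out of the console by hand, the same information can be pulled with Google’s Cloud Logging client library. This is a sketch of how that could look rather than something I actually set up, and it assumes you’ve already authenticated against your project and enabled the Cloud Logging API.

    // Sketch: fetch the most recent error-level log entries for a project.
    // Assumes prior authentication (e.g. gcloud application-default login);
    // not part of my actual workflow, just an illustration.
    import { Logging } from "@google-cloud/logging";

    async function recentErrors(projectId: string) {
      const logging = new Logging({ projectId });
      const [entries] = await logging.getEntries({
        filter: "severity>=ERROR",
        orderBy: "timestamp desc",
        pageSize: 10,
      });
      for (const entry of entries) {
        // entry.metadata carries the timestamp and severity; entry.data is the payload.
        console.log(entry.metadata.timestamp, JSON.stringify(entry.data));
      }
    }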

You would think this helped us move along quickly, but it didn’t. Gemini still jumped to conclusions too quickly, I ran its solutions past ChatGPT, which spotted silly mistakes, and we’d go round in circles, with Gemini always thanking me for my knowledgeable feedback and wisdom, which would make me smile.

We spent three days trying to fix one bug, and two days on another. We got there in the end, and the app works now, but I have further functionality I want to add. Honestly, I am a bit terrified about the potential challenges ahead.

Is AI intelligent?

Firebase Studio and Gemini are clever. Really clever. However, in my opinion, they’re not intelligent. This form of AI is not taking over coders’ jobs any time soon. I went into this experience believing the hype, that coders could be losing their jobs within six months, but I’ve changed my mind.

My coder friend is naturally cynical and always laughs off these claims about coders losing their jobs. After it took three days to fix that one bug, he teased me: “Do you still think it’s intelligent?”

There’s a lot more I want to say about AI, but I’ll save that for another article.

At a personal level, I’ve loved the journey. These tools give us more control over the solutions we provide and improve efficiency. I’ve got plenty of other ideas for apps, including a game-based assessment that’s been bubbling away for a few years. It was on hold because I didn’t know how to code. Now maybe that’ll change.