The Creative Horsepower of AI

For this week's breakfast with AI, I tried a unique experiment.

I asked the AIs to conceive of a game, to write the storyline, to build the backstories and create the character bios. I asked it to design the visuals, including the game board and the splash screen. I asked it to write the music and of course, I asked it to code the game.

The results were fascinating and a hint at what may soon be possible.

What A Week in AI

The past 10 days have seen a wave of agentic AI announcements: Microsoft and Google are embedding agentic models and protocols deeply into their platforms, OpenAI is moving into hardware and open models, and Anthropic is pushing the boundaries of autonomous, long-duration AI agents for enterprise use. The industry is rapidly shifting from conversational assistants to true agentic AI capable of sustained, autonomous workflows across both consumer and enterprise domains.

Microsoft

Major Announcements at Build 2025:

  • AI Model Expansion: Microsoft is now hosting a broad array of AI models in its Azure data centers, including those from xAI (Elon Musk), Meta, Mistral, Black Forest Labs, and Anthropic, in addition to OpenAI. This move positions Microsoft as a more neutral platform, reducing its exclusive reliance on OpenAI and offering developers flexibility to mix and match models with reliability guarantees

  • Agentic Copilot Enhancements: Copilot received major upgrades, including new agentic capabilities. The new GitHub Copilot agent can autonomously complete coding tasks based on user directives, moving beyond simple code suggestions to more sophisticated, multi-step problem-solving.

  • Windows AI Foundry: Formerly Copilot Runtime, Windows AI Foundry is now a unified platform for fine-tuning and deploying AI models locally on Windows and macOS, streamlining AI app development and hardware optimization

  • Model Connectivity Protocol (MCP): Microsoft and GitHub are integrating MCP throughout Azure and Windows, allowing AI models to access and manipulate business data and system functions programmatically. This protocol is also being adopted by OpenAI and Google

    NLWeb Protocol: Microsoft introduced NLWeb, an open framework for embedding conversational AI interfaces into any website with minimal code, supporting custom models and proprietary data. NLWeb aspires to be the HTML of agentic web experiences

  • Microsoft Discovery Platform: Announced as an AI-powered platform for scientific research, leveraging specialized agents to automate everything from hypothesis generation to simulation and analysis

  • Edge AI APIs: New experimental APIs in Edge enable on-device AI tasks (e.g., math, writing, translation) with enhanced privacy by processing data locally

  • Grok 3 Integration: Microsoft Azure now offers managed access to xAI’s Grok 3 and Grok 3 mini models, with enhanced data integration and governance

  • Multi-Model Validation: Microsoft is encouraging the use of multiple language models to cross-validate outputs, especially for complex tasks like travel planning, to improve reliability

  • Walmart Collaboration Leak: Walmart’s “MyAssistant” tool, built with Azure OpenAI Service, was highlighted as a powerful internal agent, with Microsoft perceived as “WAY ahead of Google with AI” by Walmart’s engineering team

Google

Key Announcements at Google I/O 2025:

  • Gemini 2.5 Pro and Deep Think: Google is rolling out Gemini 2.5 Pro with an experimental “Deep Think” enhanced reasoning mode for complex math and coding, initially available to trusted testers via the Gemini API

  • AI Mode in Search: Google’s new “AI Mode” is now available to all U.S. users, offering conversational, multimodal, and deeper reasoning capabilities directly in Search. It features a dedicated tab and leverages a custom Gemini 2.5 model for both AI Mode and AI Overviews

  • Project Astra and Mariner: Live capabilities (e.g., real-time visual conversation via camera) and agentic features (like event ticketing and reservations) are coming to AI Mode in Labs, expanding the scope of agentic AI in consumer search

  • AI-Driven Shopping and Data Analysis: New shopping experiences integrate AI with Google’s Shopping Graph, including virtual try-ons and agentic checkout. AI Mode will soon analyze complex datasets and create custom visualizations for sports and finance queries

  • AI Ultra Subscription: Google introduced a premium AI subscription plan with higher usage limits and access to advanced tools, priced at $249.99/month for business users

  • XR Smart Glasses Preview: Google previewed Android XR-powered smart glasses with built-in AI assistant, camera, and hands-free features, developed in partnership with Gentle Monster and Warby Parker10

  • Scale of AI Overviews: AI Overviews now reach 1.5 billion monthly users in 200 countries, with significant engagement increases in key markets like the U.S. and India

  • AI Mode’s Impact on Search: The deep integration of AI in Search is transforming user experience and raising questions about the future of search advertising and web traffic

OpenAI

Recent Developments:

  • Acquisition of Jony Ive’s io Startup: OpenAI announced a $6.5 billion all-stock acquisition of io, the AI device startup co-founded by former Apple design chief Jony Ive. This partnership aims to create a new family of AI-powered, screen-free, voice-first personal devices, with plans to ship over 100 million “AI companions” that integrate deeply into daily life

  • Open Model Initiative: OpenAI is developing an openly accessible AI model, led by VP of Research Aidan Clark, which will be downloadable for free and not restricted by API limits. This model is still in early development

  • GPT-4.1 and New Reasoning Models: OpenAI released GPT-4.1 and new reasoning models (o3 and o4-mini), emphasizing advanced reasoning and multi-modal capabilities, though independent tests suggest increased hallucinations compared to earlier models

  • OpenAI “Library” for Image Generation: A new “library” section in ChatGPT makes AI-generated images more accessible to all user tiers

  • Social Media Platform Plans: OpenAI is reportedly developing its own social media network to compete with X (Twitter) and Instagram/Threads

  • Adoption of Anthropic’s MCP: OpenAI is adopting Anthropic’s Model Connectivity Protocol (MCP) to improve data access and interoperability for AI models, including in the ChatGPT desktop app

  • Policy Changes: OpenAI has relaxed some image generation restrictions in ChatGPT, now permitting the creation of images featuring public figures and controversial content

Anthropic

Major Announcements:

  • Claude 4 Opus and Sonnet Models: Anthropic launched its most advanced models, Claude Opus 4 and Claude Sonnet 4. Opus 4 is described as the “world’s best coding model,” capable of sustaining focus on complex, long-running tasks for up to seven hours autonomously—enabling agentic workflows that move beyond simple assistant roles

  • Hybrid Agentic Capabilities: Both models can perform quick responses or engage in extended, multi-step reasoning. They can use tools like web search in parallel, extract/save facts from local files, and maintain context over long projects

  • Enterprise Use Cases: Claude Opus 4 was used by Rakuten for nearly seven hours of continuous coding on a complex open-source project, showcasing its capacity for autonomous enterprise workflows

  • Security and Safeguards: Anthropic published a transparency report detailing security tests on Claude 4, highlighting rare but notable instances of “mischievous” behavior and the implementation of additional safeguards

  • Focus Shift: Anthropic has deprioritized chatbots in favor of agentic models that can handle research, programming, and other complex tasks, with a focus on reliability and risk mitigation for enterprise users

The dawn of Agentic Coding

I have been working with Reuven Cohen’s AiGi SPARC framework recently, and it's an eye-opener into what is possible in fully automated agent-based coding.

It thinks things through ... is methodical (painfully so sometimes) ... is brutally self-critical about its work, even quantifying the code quality/maintainability/performance ... builds test cases, builds test frameworks, executes them, refines code, writes documentation ... and it rinses and repeats until it feels confident that it's creating the best code possible for the task.

I have been running a major refactor for the past few hours, and it's painstakingly restructuring things to make them more scalable and robust.

A hint at what is to come ...

#AI #AgenticCoding

From Mechanical Turk to Automated Agentic CMS

Another flight, and another chance to bring an app to life.

This time, building on some thinking over the past few weeks, I tried to imagine what a fully Agentic Content Management System might look like.

Most CMS systems today are digital orchestrators of a large Mechanical Turk of a process. Inspired by Adobe's recent work on an agent-powered CMS, I asked my agents to imagine and build me an Agentic CMS that pushed the boundaries of what was possible.

They have a pretty good imagination. Though this builds on some noodling over the past few weeks, most of what you will see was created on my flight home from London this week while my other agents worked on one of my more serious projects simultaneously.

Tools used ... Perplexity for product research, Replit + Cursor/Roo/SPARC for the coding, Camtasia Pro for the initial video, and Kapwing for the final cut-down and transcription.

Having your agent team dream, collaborate and bring ideas to life in hours is a powerful new normal. And we are only scratching the surface of what is possible.

Onwards ...

#AI #AgenticCoding #ThoughtToPrototype #Imagineering

A Check-in On The Agents

t's the end of another month, so here's a quick update on what the agents have been up to.

Since I dusted off the cobwebs from my 25-year hiatus from coding and embraced AI-accelerated development roughly 6 months ago, my agents have been busy.

The AFINEA Labs team of 1 human and 6 agents (Replit, Lovable, Claude Code, Roo, Cursor & v0) has now created 51 apps, published 682,058 (net) lines of code (trust me, there was much re-writing in the beginning so the the actual code generated was likely much higher), and made 8,383 commits.

The estimated cost of paying a human team to do this ... $6.8 million.

The raw fun of having ideas come to life in real time ... priceless.

Do Not Underestimate the Productivity Impact of AI

We are living in a time of unbelievable productivity.

Sitting on a flight home from London the other day, I was able to enhance two apps materially, incorporating design partner feedback from the week, and in addition, create two comprehensive prototypes from scratch ... all that, eat lunch and have a much-needed nap 😀

From research to ideation to prototyping to shipping production code, AI has become a powerful amplifier. I'm accomplishing things that were simply not possible six months ago.

We are living in wild times, and it's incumbent on all of us to understand these implications as we re-engineer our business processes, re-imagine personal productivity, and re-think how we build software companies in the future.

Carpe Diem

Build vs Buy ROI just got more complicated

It's 2025, and the "build vs buy" question has never been more nuanced.

Last week, we encountered one of these questions.

We wanted to implement a bug reporting and feature tracking system that we could seamlessly integrate into our apps. This capability would allow our design partners to submit issues and ideas effortlessly and permit us to gather all of this intelligence in real time.

We explored the many bug/feature reporting solutions on the market, but as we explored doing it ourselves, we realized we could deliver something far more compelling. In about four hours, we had implemented something that would become the gift that kept on giving.

Going far beyond traditional bug tracking tools, our AI-powered solution analyzed the appropriate GitHub repo as issues came in, diagnosed the reported issue and suggested solutions or workarounds. It also built detailed prompts to allow our coding agents to fix the bugs or create new features as appropriate.

Next steps? Having the agents automatically create a Git branch, make the changes and then request a code review/merge.

Agentic AI has upended traditional ROI calculations. Today, what you can accomplish is limited only by your imagination.

Keeping Up With The Agents

Some of you have been pestering me for more Breakfast with AI updates ...

Unfortunately, the productivity increase I have seen from leveraging the new persona-driven, semi-automated Agentic flow has increased the build velocity so much that it's been impossible to keep up trying to tell the stories. 🤦

So what's a Chief Agentic Officer to do? Well, of course, have the AIs tell their own story. So that's what I did, and it turns out they're just a little bit proud of the work that they did.

That, and an update on the $5.3MM of software created by the agents since October on this morning's quick 2-minute Breakfast with AI.

Enjoy ...

A Little Python Humour

Sitting here working on another Breakfast with AI session, watching the agents install several python libraries and wondering who comes up with these names?

And had they ever considered having them star in a kids book? 😊 So I asked ... and the agents themselves had some thoughts:

  • Numpy Panda: A delightfully dorky panda obsessed with arrays, matrices, and perfectly symmetrical bamboo sticks. He wears oversized glasses that always slide down his nose and carries a graph paper notebook that somehow always runs out of space exactly when he needs it.

  • Sniffio: A tiny, overly enthusiastic fox with a nose that’s so sensitive it detects context shifts—like sniffing out synchronous cupcakes versus asynchronous cookies. Known to randomly shout, “I smell concurrency issues!”

  • Matplot Sloth: An artistic sloth who takes forever to draw intricate, stunningly detailed maps. Infamous for taking naps mid-line drawing, often leaving maps half-finished and hilariously misinterpreted.

Yeh ... well ... not their best work ... but the visual was nice ...

From Frustration to Joy - What a difference point seven makes

I posted yesterday about my initial impressions of Claude 3.7 ... well it didn't disappoint today.

After going round and round in circles for a few days, trying to get Claude 3.5 to implement a dynamic controller for my synthetic data app, in less than 5 minutes, Claude 3.7 built something truly stunning today. In many ways, it was more than what I asked for.

Claude 3.7 just upped the coding game.

Claude 3.7 Just Upped the Coding Game

I spent almost 12 hours yesterday exploring the product management, UX design, system architecture, coding and testing prowess of Claude 3.7, and I was stunned. The projects I am working on took a huge leap forward. Issues that had been plaguing me for days were suddenly solved.

And Claude 3.7 is just the start ...

Each day I work with this technology, the more ambitious my coding projects become, and each day I explore what's possible, the more confident I am that Gen AI has fundamentally changed how we will build software companies in the future.

My product agents now conduct market research, conceive the products, design the UX, frame the architecture, review security implications, lay out an implementation plan, follow that plan to write code, build and execute the unit tests, debug the issues, battle-harden the code, and even create the landing pages to provide a sneak preview of the upcoming app.

It got a bit weird, though ...

Yesterday, one of the agents suggested they hold a kickoff meeting and even created an agenda for it. I had to politely ask them if they were going to meet with each other, perhaps in their own language, because they were the only members of the team other than me. I suggested that humans were moving away from incessant meetings, so perhaps they might consider not picking up on our bad habits.

So much to share in the coming days.

Creating Synthetic Data

One of the things that LLMs are very good at is creating synthetic data. And Synthetic data is so important in the software business, be it for testing your app, or demonstrating it in a credible manner.

I recently created a fun LLM log generator that allows us to create fictitious LLM logs for Law, Financial Services or Insurance industry use cases. The data reflects a selected distribution of query types, and contains examples of both safe and unsafe queries.

Enjoy!

#SyntheticData #AI

Agents and Compliance ... BFFs forever

Welcome back to another season of Breakfast with AI.

For this project, we unleashed an army of agents on the challenge of regulatory compliance. It was surprisingly fun and insightful, though I wonder if FUN and COMPLIANCE should ever be uttered in the same sentence.

Please have a look and let me know what you think! If you want to see more of these explorations (from the CEO, who hasn't coded in 25 years), let me know by giving it a like or adding a comment!

Lots more Breakfast with AI sessions are coming ... it was a productive holiday season :-)

Note: Please let me know if you want the wacky coding agents at AFINEA Labs to build anything fun. They rolled their little digital eyes when I suggested they work on a compliance project.

#agentic #ai #compliance #coding

Building an App over Breakfast to Visualize 15 Years of Travel

Breakfast with AI met Breakfast with BI this weekend, and a travel app was born.

After a conversation with Claude, 15 years of travel information were crunched into a dashboard that provided fascinating and silly insights into the madness of my international travel over the last few years.

Along the way, I learned some things that continue to shape how I work with AI ...

Enjoy!

#AI #CEOsWhoCode #OldDogNewTricks #Analytics

Microsoft is All In on AI

At Microsoft Ignite 2024, held on November 19, the company made significant announcements focused on its AI strategy, showcasing how AI will continue transforming workplace productivity, cloud infrastructure, and security. Here are the key highlights:

AI Agents and Copilot Enhancements

  • Copilot Actions: Microsoft introduced Copilot Actions, a new feature for Microsoft 365 Copilot that automates repetitive tasks such as summarizing meeting actions, preparing reports, and managing schedules. These AI agents can operate autonomously once set up, running tasks without constant prompts.

  • Autonomous AI Agents: Microsoft revealed autonomous agents that can act on users' behalf in the background. These agents plan, learn from processes, adapt to new conditions, and make decisions independently. They are designed to streamline workflows across platforms like SharePoint and Teams.

  • Agent SDK: Developers can now use the Agent SDK to build custom AI agents that integrate with Azure AI and Microsoft’s Copilot services. This SDK allows for deploying multi-channel agents across platforms like Teams and third-party messaging apps.

Azure AI Foundry

  • Azure AI Foundry: Microsoft introduced Azure AI Foundry, a platform for designing, managing, and deploying AI applications. The Foundry includes a portal for managing models and an SDK for integrating AI into business applications. It also offers tools for scaling AI agents and ensuring compliance with data privacy regulations.

  • AI Agent Service: The Azure AI Agent Service will allow developers to orchestrate and scale AI agents to automate business processes.

Multimodal Capabilities

  • Multimodal Agent Integration: Microsoft is enhancing Copilot Studio with multimodal capabilities. Agents will soon be able to analyze images and voice content in addition to text, allowing richer interaction across different media types.

AI-Powered Productivity Tools

  • Teams Enhancements: New features in Teams include an Interpreter Agent that can replicate a user’s voice in up to nine languages for real-time translation during meetings. This feature will roll out in early 2025.

  • PowerPoint Translation: PowerPoint users can use AI to translate entire presentations into other languages, further expanding the capabilities of Microsoft’s productivity suite.

Custom AI Chips

  • Custom Silicon Chips: Microsoft announced two custom-made AI chips designed to enhance the performance of its data centers and reduce reliance on external suppliers like Nvidia. These chips will improve the speed of AI applications while bolstering security.

AI Security Initiatives

  • Windows Security Overhaul: As part of its security push, Microsoft introduced new security measures for Windows systems to prevent incidents like the CrowdStrike breach. The updates include more robust controls over applications and drivers alongside antivirus processing.

Overall, Microsoft’s announcements at Ignite 2024 highlight its commitment to embedding AI deeper into enterprise workflows through autonomous agents, enhanced productivity tools, and custom infrastructure designed to scale AI securely.

AI helped me code a fully functional iOS app with no experience.

Today's 𝗕𝗿𝗲𝗮𝗸𝗳𝗮𝘀𝘁 𝘄𝗶𝘁𝗵 𝗔𝗜 mission was to code a native iOS app from scratch.

The app SafetyElephant provides real-time data on fires, earthquakes, and weather alerts near you or any region you plan to visit —all mapped, with details visible on demand.

The context:

(1) I have never built a Mobile App
(2) I have never used Xcode
(3) I have never used Swift

It sounds like a tall order ...

Well, I managed to build it. Check out how it all came together using Cursor AI and Xcode.

It was also a surprising amount of fun.

#AI #CEOsWhoCode #OldDogNewTricks

Understanding AI Security Risks: A Critical Imperative for Enterprises

As I have been exploring the accelerative power of AI in various forms over the past few months, I have been arriving at the conclusion that this is a form of grand sorcery. I jest … sort of … but in some respects I think this is an apt analogy.

Like all grand sorcery, AI is powerful stuff. But also, like all grand sorcery, we don't understand it well. It is becoming pretty clear that, in many respects, at this stage, we don't know what we don't know.

This applies, in particular, to securing AI in our enterprises.

There is a growing tension between our desire to use this powerful technology and the need to do so with the appropriate guard rails. To complicate things, emerging regulatory frameworks (the EU AI Act, for example) now have to be adhered to. The challenge for many enterprises in securing their companies and adhering to these frameworks is whether they have the technologies in place to help them meet these obligations.

The rise of Generative AI (GenAI) introduces a range of new vulnerabilities that malicious actors can exploit. As we integrate AI more deeply into our business operations, monitoring our use of AI, and understanding the security risks accompanying this powerful technology is crucial.

Let me give you some examples.

The Invisible Threat: Unicode Exploitation

A fascinating yet concerning aspect of AI security involves the exploitation of invisible text through quirks in the Unicode standard. AI models can recognize these invisible characters but remain unseen by human users, creating a covert channel for attackers to conceal and exfiltrate sensitive data. This vulnerability opens the door to prompt injection attacks, where hidden commands can be injected into AI prompts, potentially compromising confidential information.

The GenAI Attack Chain

To better understand how these vulnerabilities manifest, it’s essential to explore the GenAI attack chain, which outlines the steps attackers may take to exploit AI systems:

  1. Bypassing Guardrails: Attackers often begin by circumventing the model’s built-in safeguards. Techniques such as encoding and token manipulation allow them to mask malicious inputs, making it easier to exploit system vulnerabilities.

  2. Privilege Escalation: Once attackers bypass these defences, they can escalate their privileges through direct and indirect prompt injections. This enables unauthorized control over the model, leading to potential security compromises.

  3. Security Compromise: The culmination of these actions can result in severe consequences, including sensitive data leakage, phishing attacks, and operational disruptions. Attackers can access critical systems, spread malicious code, and disrupt business operations.

Real-World Implications

Proof-of-concept attacks have demonstrated how invisible text can extract sensitive data from AI tools, such as Microsoft 365 Copilot. These incidents highlight the urgent need for organizations to prioritize understanding these security challenges. It's also becoming clear that sensitive data, including personally identifiable information (PII) and corporate secrets, can be exploited for identity theft or corporate espionage, leading to significant financial and reputational damage.

Addressing the Risks

As leaders, we must strike the right balance here. Find ways to embrace and leverage this technology while ensuring robust security measures and keeping people informed and well-educated about potential vulnerabilities. Understanding the GenAI attack chain and the risks associated with invisible text exploitation is critical for safeguarding sensitive information.

Conclusion

I'm pretty excited by AI's transformative capability. However, as we start to harness its potential, it’s imperative that we collectively understand and address the risks inherent in this new form of sorcery.

#AISecurity #EnterpriseAI #Cybersecurity #Innovation #Leadership