Unleashing your AI CPO

Mark Ruddock June 29, 2025

A few weeks ago, I had a discussion about "table top exercises" and their utility in helping train internal teams to respond to cyber attacks. I was curious about the space, and so I had the agents build a very simple app that helped companies customize table top exercises, execute them with their teams, and score the responses.

For this weekend's breakfast with AI, I asked the agents to dream further ... to imagine far beyond what they had built and come up with something that would have no competitive peer in the industry.

I basically turned them into Chief Product Officers, and gave them the mandate of building something unique.

What they came up with was pretty interesting, and will be the topic of my next "Breakfast with AI" video:

⚔️ AI-Powered Red Team Integration: Dynamic adversary simulation with configurable threat actors (nation-state, ransomware groups, insider threats) that adapt tactics based on defensive responses

🌊 Cascading Incident Simulation: Multi-system failure modelling across supply chains, market-wide events, and infrastructure with real-time financial impact calculations

🏢 Physical-Cyber Convergence: Integrated physical and cybersecurity crisis simulation addressing facility security, manufacturing floor attacks, and critical infrastructure

🤝 Multi-Organization Coordination: Complete inter-company crisis coordination with regulatory authorities, law enforcement, media, and vendor relationships

🗣️ Advanced Voice Crisis Simulations: Real-time multi-character conversations with specialized AI personas (Physical Security Director, Facilities Manager, Emergency Coordinator)

🎯 Strategic Decision Analysis: Executive-level crisis decision simulation with financial impact modelling, regulatory compliance, and business continuity trade-offs

🎯 Live Crisis Command Center Emulation: Professional-grade real-time crisis coordination dashboard, with executive-level visibility across multiple organizations during active incidents with threat level monitoring and financial impact tracking

🧠 Predictive Crisis Intelligence: Machine learning models that forecast team performance degradation 30 minutes in advance with confidence intervals

📊 Readiness Analytics Dashboard: Comprehensive ML-powered performance tracking with organizational resilience scoring and industry benchmarking

And it actually runs ...

Could what started as a thought experiment have now evolved into the world's most advanced crisis training application?

This experience was wild ...

"Breakfast with AI" projects like this show me what's possible when we combine human curiosity and vision with AI-powered research and code generation.

Thrilling ... for me at least.

Video coming next week.

AI News Roundup

Mark Ruddock June 29, 2025

Good Sunday Morning ... it's time for my speedy roundup of the key AI happenings of the week:

Interesting Model News

🔹 𝘎𝘗𝘛-5 is officially on track for a summer launch – OpenAI says it's “materially better” than GPT-4
🔹 𝘔𝘪𝘥𝘫𝘰𝘶𝘳𝘯𝘦𝘺 debuts its first text-to-video model, taking on Sora and Runway
🔹 𝘔𝘪𝘯𝘪𝘔𝘢𝘹 𝘔1 goes open-source under Apache 2.0 – lower compute, high performance
🔹 𝘔𝘪𝘴𝘵𝘳𝘢𝘭’𝘴 𝘋𝘦𝘷𝘴𝘵𝘳𝘢𝘭 fuels open-source dev tools innovation
🔹 𝘔𝘪𝘤𝘳𝘰𝘴𝘰𝘧𝘵 𝘶𝘯𝘷𝘦𝘪𝘭𝘴 𝘕𝘓𝘞𝘦𝘣 – open protocol for embedding AI on websites
🔹 𝘌𝘷𝘦𝘳𝘺𝘰𝘯𝘦 𝘭𝘰𝘷𝘦𝘴 𝘷𝘪𝘣𝘦 𝘤𝘰𝘥𝘦𝘳𝘴 – you can now vibe code and host your app in Claude
🔹 𝘗𝘦𝘳𝘱𝘭𝘦𝘹𝘪𝘵𝘺 𝘳𝘰𝘭𝘭𝘦𝘥 𝘰𝘶𝘵 𝘯𝘦𝘸 𝘧𝘦𝘢𝘵𝘶𝘳𝘦𝘴 – bolstering research and productivity features (reports, slideshows etc.)

Big Bets, Startups & Market Moves

💰 𝘚𝘰𝘧𝘵𝘉𝘢𝘯𝘬’𝘴 “𝘊𝘳𝘺𝘴𝘵𝘢𝘭 𝘓𝘢𝘯𝘥”: $1T AI + robotics hub planned in Arizona
💰 𝘔𝘪𝘳𝘢 𝘔𝘶𝘳𝘢𝘵𝘪’𝘴 𝘛𝘩𝘪𝘯𝘬𝘪𝘯𝘨 𝘔𝘢𝘤𝘩𝘪𝘯𝘦𝘴 𝘓𝘢𝘣 raises $2B at $10B valuation
💰 𝘔𝘦𝘵𝘢 makes wild attempts to grab top talent, plans $65B in AI infra spending
💰 𝘈𝘱𝘱𝘭𝘦 & 𝘗𝘦𝘳𝘱𝘭𝘦𝘹𝘪𝘵𝘺 rumours persist - search redefined for Apple

Trends Shaping the Future

🧠 𝘙𝘦𝘢𝘴𝘰𝘯𝘪𝘯𝘨 𝘮𝘰𝘥𝘦𝘭𝘴 like OpenAI’s “o3” shift focus from compute to data
🏭 𝘌𝘯𝘵𝘦𝘳𝘱𝘳𝘪𝘴𝘦 𝘈𝘐: $758B global market this year → $3.68T by 2034
🤖 𝘈𝘨𝘦𝘯𝘵𝘪𝘤 𝘈𝘐: More autonomous task-completing models popping up everywhere!
🛠️ 𝘖𝘱𝘦𝘯-𝘴𝘰𝘶𝘳𝘤𝘦 𝘨𝘳𝘰𝘸𝘪𝘯𝘨 𝘮𝘰𝘮𝘦𝘯𝘵𝘶𝘮: Devstral, MiniMax M1, Qwen3, NLWeb signal rapid democratization

Global Policy & AI Thought Leaders

💡 𝘍𝘦𝘪-𝘍𝘦𝘪 𝘓𝘪: Human-aligned AI a boon for health & sustainability
💡 𝘈𝘯𝘥𝘳𝘦𝘸 𝘕𝘨: Practical AI wins in the near term
💡 𝘠𝘰𝘴𝘩𝘶𝘢 𝘉𝘦𝘯𝘨𝘪𝘰: Pause big models until safety improves
💡 𝘌𝘭𝘰𝘯 𝘔𝘶𝘴𝘬: Warns of risks—predicts human-level AI by 2026

Other Headlines

⚡ 𝘚𝘵𝘢𝘳𝘵𝘶𝘱𝘴 𝘴𝘤𝘢𝘭𝘪𝘯𝘨 𝘵𝘰 $20𝘔 𝘪𝘯 60 𝘥𝘢𝘺𝘴 is the new normal
🏥 The U.S. FDA launched its first agency-wide AI tool, “INTACT”
🏥 𝘈𝘐 𝘪𝘯 𝘩𝘦𝘢𝘭𝘵𝘩𝘤𝘢𝘳𝘦 & 𝘳𝘰𝘣𝘰𝘵𝘪𝘤𝘴 accelerating beyond expectations
🏥 𝘈𝘐 𝘌𝘮𝘱𝘢𝘵𝘩𝘺: New research suggests AI can demonstrate responses perceived as empathetic, sometimes outperforming humans, with implications for therapy and customer service

The AI world is in overdrive, with agentic systems, reasoning models, open innovation, and eye-watering investments pushing boundaries fast.

Exciting, but the call for responsible AI has never been louder.

Carpe Diem. With Guard Rails.

There's Never Been a Better Time to Create

Mark Ruddock June 26, 2025

The founder journey used to be predictable in its unpredictability. You'd code in basements, bootstrap until you couldn't, raise capital, scale teams, fight fires, and if you survived the 90% failure rate, you'd build something meaningful.

After 25 years of this dance, leading teams ranging from 5 people in a basement to 3,500 across 17 countries, I thought I'd seen it all.

Then I picked up coding again after a quarter-century hiatus, and what I discovered fundamentally rewrites the founder playbook.

In 1999, launching a tech company meant assembling armies. You needed developers, designers, QA teams, project managers, and documentation writers. Months to ship an MVP. Years to iterate. Millions in burn rate before you knew if anyone cared.

Agentic coding just turned this upside down.

Over the past 8 months, I've created over 60 apps, or about $10.8MM worth of software, for roughly $10,000 in compute costs. That's it. Period.

That's not a typo. That's a paradigm shift.

What Changes:

🚀 Speed of Validation: Old world: 6-12 months to test an idea. AI world: 6-12 days to ship working software. The founder's greatest enemy has always been time. Now we can validate ideas at the speed of thought. "Fail fast" has become "fail instantly," and that's liberating.

💡 The Solo Founder Renaissance: Remember when VCs wouldn't touch solo founders? That bias may diminish. One founder with AI agents can now outpace traditional 10-person teams. The economics are undeniable.

🧠 From Managing People to Managing Intelligence: The skillset shifts from recruiting and retaining talent to orchestrating AI capabilities. Your agents don't need equity, don't burn out, and code while you sleep. But they need precise direction, thoughtful prompting, and strategic oversight.

📊 Capital Efficiency on Steroids: We used to measure burn rate in millions per month. Now? Build first, raise later. Or maybe never. When you can prototype for the cost of a used car, the entire venture model needs rethinking.

🎯 Hyper-Verticalization Becomes Viable: That niche market of 1,000 customers? Previously uneconomical. Now? Build bespoke solutions for micro-verticals. The long tail of software is about to explode.

But Here's What Doesn't Change:

That founder madness I wrote about a few weeks ago. Still essential. Maybe more so. Because while AI handles the mechanical, you still need:

1. The vision to see what others miss
2. The courage to challenge incumbents
3. The persistence to push through the "no's"
4. The wisdom to know when to pivot

AI doesn't replace founder instinct. It amplifies it.

The tools are here. The economics work. The only question is whether you have the founder madness to seize this moment.

After 25 years of building the old way, I can tell you with certainty: There's never been a better time to be a founder.

The future isn't coming. It's compiling.

Carpe Diem.

AIs Reimagine the Future of Banking

Mark Ruddock June 18, 2025

Last week, 𝘄𝗲 𝗹𝗲𝘁 𝗔𝗜 𝘁𝗮𝗸𝗲 𝗳𝘂𝗹𝗹 𝗰𝗿𝗲𝗮𝘁𝗶𝘃𝗲 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 𝗼𝗳 𝗱𝗲𝘀𝗶𝗴𝗻𝗶𝗻𝗴 𝗮 𝗴𝗮𝗺𝗲 - from concept to characters to soundtrack to gameplay.

This week, we challenged them with something even bigger: 𝗿𝗲𝗶𝗺𝗮𝗴𝗶𝗻𝗶𝗻𝗴 𝘁𝗵𝗲 𝗳𝘂𝘁𝘂𝗿𝗲 𝗼𝗳 𝗯𝗮𝗻𝗸𝗶𝗻𝗴.

They came up with 𝗡𝗲𝘂𝗿𝗼𝗕𝗮𝗻𝗸, 𝗮𝗻 𝗔𝗜-𝗱𝗿𝗶𝘃𝗲𝗻 𝗳𝗶𝗻𝗮𝗻𝗰𝗶𝗮𝗹 𝗲𝗰𝗼𝘀𝘆𝘀𝘁𝗲𝗺 𝘄𝗵𝗲𝗿𝗲 𝗮𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗼𝗻 𝗱𝗼𝗲𝘀 𝘁𝗵𝗲 𝗵𝗲𝗮𝘃𝘆 𝗹𝗶𝗳𝘁𝗶𝗻𝗴, 𝗮𝗻𝗱 𝗽𝗲𝗿𝘀𝗼𝗻𝗮𝗹 𝗳𝗶𝗻𝗮𝗻𝗰𝗶𝗮𝗹 𝘀𝘁𝗿𝗮𝘁𝗲𝗴𝘆 𝘁𝗮𝗸𝗲𝘀 𝗰𝗲𝗻𝘁𝗿𝗲 𝘀𝘁𝗮𝗴𝗲.

Forget manual transactions. Imagine a world where your 𝗽𝗲𝗿𝘀𝗼𝗻𝗮𝗹 𝗔𝗜 𝗮𝗴𝗲𝗻𝘁𝘀 optimize savings, manage investments, analyze spending patterns, and pay bills seamlessly, while offering 𝗳𝗼𝗿𝘄𝗮𝗿𝗱-𝘁𝗵𝗶𝗻𝗸𝗶𝗻𝗴 𝗳𝗶𝗻𝗮𝗻𝗰𝗶𝗮𝗹 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀 tailored to you.

But it didn’t stop there. NeuroBank redefines the way we interact with money, making banking 𝗽𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝘃𝗲, 𝘀𝘁𝗿𝗮𝘁𝗲𝗴𝗶𝗰, and even 𝗰𝗼𝗻𝘃𝗲𝗿𝘀𝗮𝘁𝗶𝗼𝗻𝗮𝗹𝗹𝘆 𝗶𝗻𝘁𝗲𝗿𝗮𝗰𝘁𝗶𝘃𝗲.

Could this be the model for the future?

The agents thought so.

Agentic performance update

Mark Ruddock June 2, 2025

Well, another month has gone by (wow seems like just yesterday that I posted April's results) ... welcome to the intensity of AI years ...

At any rate, here's a look at what the agents have been up to since October 2024, when I started my "could a CEO who hadn't coded for 25 years, code an app using AI" journey.

The agents and I have now delivered the equivalent of $10.8MM in software ... across 10,665 commits spanning 62 repos. Most of these are fun ... several of these are serious ... some of these are in production and around which emerging startups are being built.

The economic ROI of delivering $10.8MM in software developed for probably $10,000 in token and system costs ... compelling.

The joy of bringing things to life at the speed of thought ... priceless.
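
For anyone who wants to check the math on the economic side, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above; the $10,000 cost is a rough estimate ("probably"), so treat the outputs as orders of magnitude rather than precise results.

```python
# Figures quoted in this post; the cost is a rough estimate, not an audited number.
software_value_usd = 10_800_000
compute_cost_usd = 10_000
commits = 10_665
repos = 62

# How many dollars of software were delivered per dollar of token/system cost.
value_multiple = software_value_usd / compute_cost_usd
# Simple averages across repos (real repos will vary widely).
cost_per_repo = compute_cost_usd / repos
commits_per_repo = commits / repos

print(f"Value delivered per dollar spent: {value_multiple:,.0f}x")
print(f"Approximate cost per repo: ${cost_per_repo:,.2f}")
print(f"Average commits per repo: {commits_per_repo:,.0f}")
```

That works out to roughly a thousand-fold return on compute spend, at well under $200 per repo on average.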

Carpe Diem!

Documentation: Agentic Superpower

Mark Ruddock June 1, 2025

Coding agents have an overlooked superpower: documentation.

Their ability to write technical and product specs, design and update implementation roadmaps, write test plans, and even explain code effortlessly ... not only helps humans approve what the agents are planning to build, but helps the agents stay on track while building it.

One of the most important side effects of good implementation documentation is the ability for the agents to maintain larger and longer context windows (even across sessions). This helps the agents stay disciplined and focused as they proceed through complex multi-step coding tasks, validate that what they have built conforms to the overall specs, and makes code more manageable ... for both machines and humans.

We all used to hate writing documentation ... now it's one prompt away.

Carpe Diem.

Getting the most from your coding agent

Mark Ruddock May 31, 2025

When I first started using Agentic LLM coding assistants, I was thrilled by their potential, but also often frustrated by their unpredictability.

Some days, they produced near-perfect code; other times, inexplicable errors crept in, despite similar instructions. In fairness, sometimes it was me and sometimes it was them.

It didn’t take long to realize that AI coding isn’t just about the model ... it’s about how you interact with it ... what tools you use ... and it's also sometimes influenced by factors completely outside of your control.

📝 Precision in Prompts and Rules: The quality of your prompts and the clarity of system-level instructions drastically impact output. Structured approaches like SPARC excel because of their robust rule sets, precise directives and iterative build/test/reflect/improve loops. Some agents (like Replit) try to incorporate these best practices behind the scenes, improving the quality of the output. Tools like Cursor and Augment Code work hard to refine system-level prompts and augment baseline models, ensuring more consistent and effective results.

🤝 Managing Development Dialogue: Engaging thoughtfully with the agent throughout the entire development process is essential. Providing clear instructions, encouraging iteration and refinement, asking it to evaluate its own work, and structuring dialogue thoughtfully all help maintain quality, even when external factors cause performance fluctuations. Don't rush. Be precise. Make it test its own work.

⏳ Timing and System Load: Quality can vary based on time of day and server demand. In Eastern Time, mid-late afternoons and even occasionally weekends (vibe code mania) seem to bring performance dips, likely due to higher loads and resource allocation ... while late evening sees a rebound. Recognizing these patterns allows for smarter task scheduling. If switching tools doesn't help, take a break, go for a walk, listen to a podcast, do some product research on Perplexity ... even do something analog 😀

🔄 Keeping Up with Platforms: Staying updated on the latest LLM versions is crucial. The capability difference between models like Claude 4 vs. Claude 3.5, or o3 vs Gemini 2.5 Pro WRT diverse problem sets, is material. Some models are clearly better at some things than others. And knowing when to request deep thinking is important (certainly from a cost benefit basis). Get the model to document their work so they have a long-term record of what they have done and what they have been thinking. This really helps with limited context windows.

So, while LLM variability presents challenges, optimizing prompts, structuring interactions, being strategic in timing, and leveraging the right tools can significantly improve results.

Welcome to the entropy of emergent systems.

Another Wild Week in AI

Mark Ruddock May 30, 2025

I know I struggle to keep up with the pace of AI news, so I thought I'd share a quick summary of another amazing week in AI:

🤖 Mistral AI Agents: Mistral AI unveiled its Agents API, equipping developers with powerful autonomous AI agents for coding, finance, travel, and more. Features include server-side conversation management, web search, and document retrieval with persistent memory. I am going to be trying Devstral on my local machine.

🗣️ Claude Voice Mode: Anthropic’s Claude AI now supports conversational voice interactions! Users can chat with Claude using real-time audio responses and hands-free access to Google Workspace tools like Gmail and Google Docs. I actually love conversational mode for brainstorming with my AI counterparts.

🤟 Google SignGemma: DeepMind introduced SignGemma, its most advanced AI model for translating American Sign Language into spoken text, driving accessibility and inclusion across education, workplaces, and beyond. As evidenced by the stunning VEO 3 as well, Google continues to flex its muscles from the shadows. So much more to come I suspect.

💻 Factory AI SWE Agents: Factory.ai is revolutionizing AI-powered coding assistants, offering enterprise-grade “droids” for programming, knowledge retrieval, and seamless integration with tools like JIRA, Slack, and GitHub.

📊 Perplexity Labs: Perplexity AI launched Labs, a workspace where AI agents help users automate reports, spreadsheets, and dashboards. This tool speeds up projects that typically require days of manual work! It will render swaths of junior analyst roles redundant.

🎨 Flux.1 Image Editing Kontext: FLUX.1 introduces precision-driven AI image editing. Users can tweak colors, remove objects, or refine visuals step by step, making creative workflows faster and more intuitive. This is actually seriously cool.

🌍 SpAItial AI Foundation Models: SpAItial raised $13M to develop AI models that generate 3D environments from text prompts, promising a future where anyone can create virtual worlds with ease. So many applications ... not just games.

🌐 Opera Neon Browser: Opera unveiled Neon, the first AI-powered browser that automates web tasks, offers real-time AI chat, and builds content dynamically ... all while ensuring privacy and flexibility.

These innovations aren’t just theoretical—they’re actively transforming industries from accessibility to automation ... and they are going to have a big impact on who we hire for what.

And so, while we are living in interesting times, the most disturbing piece of news is what I think many of us already fear. Dario Amodei, CEO of Anthropic, postulated this week that, for a time at least, we will likely see significant disruption in white collar jobs and perhaps unemployment between 10-20% ...

I fear our leaders and our economies are not equipped to deal with this.

The Creative Horsepower of AI

Mark Ruddock May 30, 2025

For this week's breakfast with AI, I tried a unique experiment.

I asked the AIs to conceive of a game, to write the storyline, to build the backstories and create the character bios. I asked them to design the visuals, including the game board and the splash screen. I asked them to write the music and, of course, I asked them to code the game.

The results were fascinating and a hint at what may soon be possible.

What A Week in AI

Mark Ruddock May 25, 2025

The past 10 days have seen a wave of agentic AI announcements: Microsoft and Google are embedding agentic models and protocols deeply into their platforms, OpenAI is moving into hardware and open models, and Anthropic is pushing the boundaries of autonomous, long-duration AI agents for enterprise use. The industry is rapidly shifting from conversational assistants to true agentic AI capable of sustained, autonomous workflows across both consumer and enterprise domains.

Microsoft

Major Announcements at Build 2025:

AI Model Expansion: Microsoft is now hosting a broad array of AI models in its Azure data centers, including those from xAI (Elon Musk), Meta, Mistral, Black Forest Labs, and Anthropic, in addition to OpenAI. This move positions Microsoft as a more neutral platform, reducing its exclusive reliance on OpenAI and offering developers flexibility to mix and match models with reliability guarantees
Agentic Copilot Enhancements: Copilot received major upgrades, including new agentic capabilities. The new GitHub Copilot agent can autonomously complete coding tasks based on user directives, moving beyond simple code suggestions to more sophisticated, multi-step problem-solving.
Windows AI Foundry: Formerly Copilot Runtime, Windows AI Foundry is now a unified platform for fine-tuning and deploying AI models locally on Windows and macOS, streamlining AI app development and hardware optimization
Model Context Protocol (MCP): Microsoft and GitHub are integrating MCP throughout Azure and Windows, allowing AI models to access and manipulate business data and system functions programmatically. This protocol is also being adopted by OpenAI and Google
NLWeb Protocol: Microsoft introduced NLWeb, an open framework for embedding conversational AI interfaces into any website with minimal code, supporting custom models and proprietary data. NLWeb aspires to be the HTML of agentic web experiences
Microsoft Discovery Platform: Announced as an AI-powered platform for scientific research, leveraging specialized agents to automate everything from hypothesis generation to simulation and analysis
Edge AI APIs: New experimental APIs in Edge enable on-device AI tasks (e.g., math, writing, translation) with enhanced privacy by processing data locally
Grok 3 Integration: Microsoft Azure now offers managed access to xAI’s Grok 3 and Grok 3 mini models, with enhanced data integration and governance
Multi-Model Validation: Microsoft is encouraging the use of multiple language models to cross-validate outputs, especially for complex tasks like travel planning, to improve reliability
Walmart Collaboration Leak: Walmart’s “MyAssistant” tool, built with Azure OpenAI Service, was highlighted as a powerful internal agent, with Microsoft perceived as “WAY ahead of Google with AI” by Walmart’s engineering team

Google

Key Announcements at Google I/O 2025:

Gemini 2.5 Pro and Deep Think: Google is rolling out Gemini 2.5 Pro with an experimental “Deep Think” enhanced reasoning mode for complex math and coding, initially available to trusted testers via the Gemini API
AI Mode in Search: Google’s new “AI Mode” is now available to all U.S. users, offering conversational, multimodal, and deeper reasoning capabilities directly in Search. It features a dedicated tab and leverages a custom Gemini 2.5 model for both AI Mode and AI Overviews
Project Astra and Mariner: Live capabilities (e.g., real-time visual conversation via camera) and agentic features (like event ticketing and reservations) are coming to AI Mode in Labs, expanding the scope of agentic AI in consumer search
AI-Driven Shopping and Data Analysis: New shopping experiences integrate AI with Google’s Shopping Graph, including virtual try-ons and agentic checkout. AI Mode will soon analyze complex datasets and create custom visualizations for sports and finance queries
AI Ultra Subscription: Google introduced a premium AI subscription plan with higher usage limits and access to advanced tools, priced at $249.99/month for business users
XR Smart Glasses Preview: Google previewed Android XR-powered smart glasses with built-in AI assistant, camera, and hands-free features, developed in partnership with Gentle Monster and Warby Parker
Scale of AI Overviews: AI Overviews now reach 1.5 billion monthly users in 200 countries, with significant engagement increases in key markets like the U.S. and India
AI Mode’s Impact on Search: The deep integration of AI in Search is transforming user experience and raising questions about the future of search advertising and web traffic

OpenAI

Recent Developments:

Acquisition of Jony Ive’s io Startup: OpenAI announced a $6.5 billion all-stock acquisition of io, the AI device startup co-founded by former Apple design chief Jony Ive. This partnership aims to create a new family of AI-powered, screen-free, voice-first personal devices, with plans to ship over 100 million “AI companions” that integrate deeply into daily life
Open Model Initiative: OpenAI is developing an openly accessible AI model, led by VP of Research Aidan Clark, which will be downloadable for free and not restricted by API limits. This model is still in early development
GPT-4.1 and New Reasoning Models: OpenAI released GPT-4.1 and new reasoning models (o3 and o4-mini), emphasizing advanced reasoning and multi-modal capabilities, though independent tests suggest increased hallucinations compared to earlier models
OpenAI “Library” for Image Generation: A new “library” section in ChatGPT makes AI-generated images more accessible to all user tiers
Social Media Platform Plans: OpenAI is reportedly developing its own social media network to compete with X (Twitter) and Instagram/Threads
Adoption of Anthropic’s MCP: OpenAI is adopting Anthropic’s Model Context Protocol (MCP) to improve data access and interoperability for AI models, including in the ChatGPT desktop app
Policy Changes: OpenAI has relaxed some image generation restrictions in ChatGPT, now permitting the creation of images featuring public figures and controversial content

Anthropic

Major Announcements:

Claude 4 Opus and Sonnet Models: Anthropic launched its most advanced models, Claude Opus 4 and Claude Sonnet 4. Opus 4 is described as the “world’s best coding model,” capable of sustaining focus on complex, long-running tasks for up to seven hours autonomously—enabling agentic workflows that move beyond simple assistant roles
Hybrid Agentic Capabilities: Both models can perform quick responses or engage in extended, multi-step reasoning. They can use tools like web search in parallel, extract/save facts from local files, and maintain context over long projects
Enterprise Use Cases: Claude Opus 4 was used by Rakuten for nearly seven hours of continuous coding on a complex open-source project, showcasing its capacity for autonomous enterprise workflows
Security and Safeguards: Anthropic published a transparency report detailing security tests on Claude 4, highlighting rare but notable instances of “mischievous” behavior and the implementation of additional safeguards
Focus Shift: Anthropic has deprioritized chatbots in favor of agentic models that can handle research, programming, and other complex tasks, with a focus on reliability and risk mitigation for enterprise users

The dawn of Agentic Coding

Mark Ruddock May 14, 2025

I have been working with Reuven Cohen’s AiGi SPARC framework recently, and it's an eye-opener into what is possible in fully automated agent-based coding.

It thinks things through ... is methodical (painfully so sometimes) ... is brutally self-critical about its work, even quantifying the code quality/maintainability/performance ... builds test cases, builds test frameworks, executes them, refines code, writes documentation ... and it rinses and repeats until it feels confident that it's creating the best code possible for the task.

I have been running a major refactor for the past few hours, and it's painstakingly restructuring things to make them more scalable and robust.

A hint at what is to come ...

#AI #AgenticCoding

From Mechanical Turk to Automated Agentic CMS

Mark Ruddock May 2, 2025

Another flight, and another chance to bring an app to life.

This time, building on some thinking over the past few weeks, I tried to imagine what a fully Agentic Content Management System might look like.

Most CMS systems today are digital orchestrators of a large Mechanical Turk of a process. Inspired by Adobe's recent work on an agent-powered CMS, I asked my agents to imagine and build me an Agentic CMS that pushed the boundaries of what was possible.

They have a pretty good imagination. Though this builds on some noodling over the past few weeks, most of what you will see was created on my flight home from London this week while my other agents worked on one of my more serious projects simultaneously.

Tools used ... Perplexity for product research, Replit + Cursor/Roo/SPARC for the coding, Camtasia Pro for the initial video, and Kapwing for the final cut-down and transcription.

Having your agent team dream, collaborate and bring ideas to life in hours is a powerful new normal. And we are only scratching the surface of what is possible.

Onwards ...

#AI #AgenticCoding #ThoughtToPrototype #Imagineering

A Check-in On The Agents

Mark Ruddock May 1, 2025

It's the end of another month, so here's a quick update on what the agents have been up to.

Since I dusted off the cobwebs from my 25-year hiatus from coding and embraced AI-accelerated development roughly 6 months ago, my agents have been busy.

The AFINEA Labs team of 1 human and 6 agents (Replit, Lovable, Claude Code, Roo, Cursor & v0) has now created 51 apps, published 682,058 (net) lines of code (trust me, there was much re-writing in the beginning, so the actual code generated was likely much higher), and made 8,383 commits.

The estimated cost of paying a human team to do this ... $6.8 million.

The raw fun of having ideas come to life in real time ... priceless.

Do Not Underestimate the Productivity Impact of AI

Mark Ruddock April 10, 2025

We are living in a time of unbelievable productivity.

Sitting on a flight home from London the other day, I was able to enhance two apps materially, incorporating design partner feedback from the week, and in addition, create two comprehensive prototypes from scratch ... all that, eat lunch and have a much-needed nap 😀

From research to ideation to prototyping to shipping production code, AI has become a powerful amplifier. I'm accomplishing things that were simply not possible six months ago.

We are living in wild times, and it's incumbent on all of us to understand these implications as we re-engineer our business processes, re-imagine personal productivity, and re-think how we build software companies in the future.

Carpe Diem

Build vs Buy ROI just got more complicated

Mark Ruddock April 8, 2025

It's 2025, and the "build vs buy" question has never been more nuanced.

Last week, we encountered one of these questions.

We wanted to implement a bug reporting and feature tracking system that we could seamlessly integrate into our apps. This capability would allow our design partners to submit issues and ideas effortlessly and permit us to gather all of this intelligence in real time.

We explored the many bug/feature reporting solutions on the market, but as we explored doing it ourselves, we realized we could deliver something far more compelling. In about four hours, we had implemented something that would become the gift that kept on giving.

Going far beyond traditional bug tracking tools, our AI-powered solution analyzed the appropriate GitHub repo as issues came in, diagnosed the reported issue and suggested solutions or workarounds. It also built detailed prompts to allow our coding agents to fix the bugs or create new features as appropriate.
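
To make the "detailed prompts" step a little more concrete, here is a minimal sketch of how a submitted issue and its diagnosis could be turned into a prompt for a coding agent. This is not our actual implementation: the `BugReport` fields, the `build_agent_prompt` helper, and the example repo name are all hypothetical, and the diagnosis is passed in as plain text rather than produced by a real model call.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical shape of an issue submitted by a design partner."""
    app: str
    repo: str
    title: str
    description: str
    kind: str = "bug"  # "bug" or "feature"


def build_agent_prompt(report: BugReport, diagnosis: str) -> str:
    """Combine a report and an AI-generated diagnosis into a coding-agent prompt."""
    task = "Fix the bug" if report.kind == "bug" else "Implement the feature"
    return "\n".join([
        f"{task} described below in the repository {report.repo}.",
        f"App: {report.app}",
        f"Title: {report.title}",
        f"Reported details: {report.description}",
        f"Preliminary diagnosis: {diagnosis}",
        "Write or update tests that cover the change, then summarize what you changed.",
    ])


# Illustrative values only; none of these names come from a real project.
report = BugReport(
    app="demo-app",
    repo="example-org/demo-app",
    title="Export button does nothing",
    description="Clicking Export on the dashboard has no visible effect.",
)
print(build_agent_prompt(report, "Placeholder diagnosis from the analysis step."))
```

Keeping the prompt builder as a plain function makes it easy to review what the agent will be asked to do before anything touches the repo.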

Next steps? Having the agents automatically create a Git branch, make the changes and then request a code review/merge.

Agentic AI has upended traditional ROI calculations. Today, what you can accomplish is limited only by your imagination.

Keeping Up With The Agents

Mark Ruddock April 7, 2025

Some of you have been pestering me for more Breakfast with AI updates ...

Unfortunately, the productivity increase I have seen from leveraging the new persona-driven, semi-automated Agentic flow has increased the build velocity so much that it's been impossible to keep up trying to tell the stories. 🤦

So what's a Chief Agentic Officer to do? Well, of course, have the AIs tell their own story. So that's what I did, and it turns out they're just a little bit proud of the work that they did.

That, and an update on the $5.3MM of software created by the agents since October on this morning's quick 2-minute Breakfast with AI.

Enjoy ...

A Little Python Humour

Mark Ruddock April 6, 2025

Sitting here working on another Breakfast with AI session, watching the agents install several Python libraries and wondering who comes up with these names?

And had they ever considered having them star in a kids' book? 😊 So I asked ... and the agents themselves had some thoughts:

Numpy Panda: A delightfully dorky panda obsessed with arrays, matrices, and perfectly symmetrical bamboo sticks. He wears oversized glasses that always slide down his nose and carries a graph paper notebook that somehow always runs out of space exactly when he needs it.
Sniffio: A tiny, overly enthusiastic fox with a nose that’s so sensitive it detects context shifts—like sniffing out synchronous cupcakes versus asynchronous cookies. Known to randomly shout, “I smell concurrency issues!”
Matplot Sloth: An artistic sloth who takes forever to draw intricate, stunningly detailed maps. Infamous for taking naps mid-line drawing, often leaving maps half-finished and hilariously misinterpreted.

Yeh ... well ... not their best work ... but the visual was nice ...

From Frustration to Joy - What a difference point seven makes

Mark Ruddock February 27, 2025

I posted yesterday about my initial impressions of Claude 3.7 ... well it didn't disappoint today.

After going round and round in circles for a few days, trying to get Claude 3.5 to implement a dynamic controller for my synthetic data app, in less than 5 minutes, Claude 3.7 built something truly stunning today. In many ways, it was more than what I asked for.

Claude 3.7 just upped the coding game.

Claude 3.7 Just Upped the Coding Game

Mark Ruddock February 26, 2025

I spent almost 12 hours yesterday exploring the product management, UX design, system architecture, coding and testing prowess of Claude 3.7, and I was stunned. The projects I am working on took a huge leap forward. Issues that had been plaguing me for days were suddenly solved.

And Claude 3.7 is just the start ...

Each day I work with this technology, the more ambitious my coding projects become, and each day I explore what's possible, the more confident I am that Gen AI has fundamentally changed how we will build software companies in the future.

My product agents now conduct market research, conceive the products, design the UX, frame the architecture, review security implications, lay out an implementation plan, follow that plan to write code, build and execute the unit tests, debug the issues, battle-harden the code, and even create the landing pages to provide a sneak preview of the upcoming app.

It got a bit weird, though ...

Yesterday, one of the agents suggested they hold a kickoff meeting and even created an agenda for it. I had to politely ask them if they were going to meet with each other, perhaps in their own language, because they were the only members of the team other than me. I suggested that humans were moving away from incessant meetings, so perhaps they might consider not picking up on our bad habits.

So much to share in the coming days.

Creating Synthetic Data

Mark Ruddock January 30, 2025

One of the things that LLMs are very good at is creating synthetic data. And synthetic data is so important in the software business, be it for testing your app or demonstrating it in a credible manner.

I recently created a fun LLM log generator that allows us to create fictitious LLM logs for Law, Financial Services or Insurance industry use cases. The data reflects a selected distribution of query types, and contains examples of both safe and unsafe queries.
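
For a feel of how a generator like this can work, here is a minimal Python sketch under stated assumptions. The query types, their weights, the record fields, and the `generate_logs` function are all hypothetical placeholders (my app's actual categories and schema are not shown here), and the safe/unsafe flag is just a random draw rather than real query text.

```python
import json
import random
from datetime import datetime, timedelta, timezone

# Hypothetical query types and weights, standing in for the "selected distribution".
QUERY_TYPE_WEIGHTS = {"summarize": 0.5, "draft": 0.3, "research": 0.2}
INDUSTRIES = ["Law", "Financial Services", "Insurance"]


def generate_logs(n, industry, unsafe_rate=0.1, seed=0):
    """Generate n fictitious LLM log records for one industry."""
    if industry not in INDUSTRIES:
        raise ValueError(f"unknown industry: {industry}")
    rng = random.Random(seed)  # seeded so demo data is reproducible
    start = datetime(2025, 1, 1, tzinfo=timezone.utc)
    types = list(QUERY_TYPE_WEIGHTS)
    weights = list(QUERY_TYPE_WEIGHTS.values())
    logs = []
    for i in range(n):
        logs.append({
            "id": i,
            "timestamp": (start + timedelta(minutes=i)).isoformat(),
            "industry": industry,
            # Weighted draw so the mix of query types follows the chosen distribution.
            "query_type": rng.choices(types, weights=weights)[0],
            # Each record is marked unsafe with probability unsafe_rate.
            "safe": rng.random() >= unsafe_rate,
        })
    return logs


for record in generate_logs(3, "Insurance"):
    print(json.dumps(record))
```

Seeding the generator means the same demo data can be recreated on demand, which is handy when the same dataset has to serve both tests and live demonstrations.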

Enjoy!

#SyntheticData #AI

Mark Ruddock

Thoughts of an internationally experienced growth stage CEO and Board Member.