Thought #15 - Two Weeks Back, Agents Forward

A week of time-saving tools, cautious optimism, and one inevitable AI film deal

Hi lovely humans,

The title confused me too - to be honest, ChatGPT wrote it. “Two weeks back” is a reference to the news that AI could save civil servants two weeks a year!

An oddly quiet week from OpenAI, but lots of agents coming through from different companies.

Our Week in AI

A Small Reminder Not to Entirely Trust AI

I use AI a lot. But this week has been a good reminder that it doesn’t always work.

Firstly, I was using it (Claude specifically) to code some upgrades to our platform. I was not quite vibe coding, more using it to discuss the logic for some complex features and to style some new flows.

It worked really well (although it did make our platform look like ChatGPT - I have lots of thoughts on this but that’s another story).

Our platform looking all ChatGPT-styled

Then I got cocky, and kept chatting to it about something I wanted to tweak in the flow (something a bit complex). Long story short - it didn’t work. It was a mess, the logic was off and the code itself wasn’t great.

Moral of the story - don’t trust AI without checking and understanding everything.

Secondly, I have been testing using AI (ChatGPT specifically) to speed up this newsletter. We have a nice flow where the team posts news with our 1 sentence summary to a Slack channel. Then ChatGPT turns this into our usual newsletter order (AI New Releases, AI News, Not Quite News).

This week we’ve been a bit busy (it’s deadline time for students and June is event heavy for me), so we’ve not been quite as good at posting news. I tried to use the o3 model in ChatGPT to get the news for the past week.

It output new releases as a nice table, which I actually like. The links, however, were all 404s (didn’t exist). I checked, and all the new releases were real; it was just the links that were wrong. But it made me doubt all the stories it gave me for the news, and it ended up taking me as much time as it would have if I’d done it myself - sigh!

In defence of ChatGPT, while most of the links were oddly fabricated blogs on the company website, the link to the Bing blog is actually the top search result on Google so did exist at one point.

A reminder that some of the human bits of the process can speed things up (in terms of trust and actually adding some stability).

AI New Releases

| Tool | What it is | Why it matters |
| --- | --- | --- |
| Perplexity Labs | New Pro workspace that spins prompts into dashboards, spreadsheets and mini web-apps | Starts to blur the line between search engine and low-code assistant |
| Bing Video Creator (mobile) | Free five-second vertical video generation in the Bing app, powered by Sora | Sora’s first real-world consumer use - great for social content, less for film-makers |
| Mistral Agents API | Endpoint for building lightweight tool-using agents (search, code, web) | Cheaper alternative to OpenAI/Gemini agents - good for devs and tinkering teams |
| Gemini 2.5 “Deep Think” mode | A slower but more thoughtful reasoning mode for complex tasks | Early reports say it’s better for code and analysis, but still a bit unpredictable |
| ByteDance “Dolphin” | Open-source model for parsing document layout (tables, figures, structure) | Surprisingly handy if you’re dealing with PDFs or scanned docs and want open tools |

We’re finally getting used to Claude Sonnet 4.0 (it always takes me a while to understand the best way to prompt new models). And I was actually considering coughing up for Claude Max - but the feature I wanted is now available on my $20 a month tier!

Big Last Minute News

Deep Research and Integrations available on Claude Pro - we haven’t had a chance to play just yet but we’re excited.

Claude Pro Announcement

AI News

  • Civil servants reclaim 26 minutes a day
    A UK Copilot pilot across 20,000 staff showed real time savings - mostly from summarising emails and notes. That’s nearly two working weeks a year, per person.

  • PwC AI Jobs Barometer
    AI-exposed roles have seen productivity growth rise from 7% to 27% since 2018, and the wage premium for AI-skilled roles now sits at 56%.

  • Graduate job squeeze
    AI is quietly reshaping the job market - NYT reports a rise in graduate unemployment, especially in computer science and finance.

  • McKinsey’s “Lilli” rollout
    75% of their 43,000 staff now use McKinsey’s internal GenAI assistant to build slides and proposals. Speeds things up - and keeps juniors focused elsewhere.

  • North Lincolnshire data campus
    A £5–9bn data campus could create 3,600 construction jobs per year - part of the UK’s plan to spread compute and skills beyond London.

  • DSIT’s “AI in Local Government” pilots
    A round-up of 12 AI pilots across UK councils - from AI helping process Freedom of Information requests to testing LLMs in casework triage. Gently optimistic.

  • Zapier now has more agents than employees
    Wade Foster (Zapier CEO) shared that they’re now using over 800 internal AI agents - more than the number of humans at the company. And they’ve shared some templates.

  • OpenAI’s vulnerability disclosure policy
    OpenAI’s new framework for responsibly reporting vulnerabilities found by its models - a small but important safety shift.

  • Yoshua Bengio’s “Honest AI”
    The deep learning pioneer launches a new non-profit focused on building systems that won’t lie - a guardrail move for agentic models.

  • AI in healthcare

  • Culture moment
    Luca Guadagnino will direct Artificial, a film about OpenAI’s five-day board saga.

Not Quite News, But Worth a Read (or listen or watch)

  • Bloomberg’s AI issue
    A calm look at what it means for every company to become an “AI company.” Some hype, some clarity - decent coffee read.

  • The Guardian – The OpenAI Empire
    A podcast episode from Today in Focus on Sam Altman, power, and where OpenAI’s influence might be heading. 26 minutes, well produced.

  • Exploding Topics – 7 AI Trends
    A slide-friendly roundup of where AI adoption is actually happening - healthcare, finance, sustainability. Hype kept to a minimum.

  • Why kids still need to learn to code
    A thoughtful piece from Raspberry Pi on why coding still matters in an AI world - not despite AI, but because of it.

LinkedIn AI Poll

Last week we asked about using Google’s AI. Unsurprisingly, people are not using it more than OpenAI. But most people have used it - we’re big fans of NotebookLM but don’t use it regularly.

Vote in this week’s poll - please!

This week we want to know if you’re using Perplexity!

Final Thoughts

As always we hope this was helpful!

Feel free to share this with anyone who might find it useful.

Next week, we have some more on ChatGPT for learning and will be exploring Deep Research.

Laura
Always learning

PS we’re hiring a Full Stack Dev if you know anyone!