
Ever played a quick game with your favorite chatbot, asking for a random number between one and ten? Chances are, it offered up “7.” Ask again, and you might get a “3” or “4,” followed by an “8” or “9.” While not guaranteed every time, this pattern is surprisingly common and reveals a fascinating quirk of today’s most popular large language models (LLMs).
This isn’t a magic trick; it highlights a peculiar limitation. Many mainstream LLMs fall into predictable patterns and show far less creativity than users might expect. That consistency helps with tasks like coding or straightforward research, but it is a significant hurdle for brainstorming and open-ended creative work, where it fosters a kind of digital groupthink.
The Predictability Problem: Why Our AI Thinks Alike
The homogeneity of LLM output is well documented. A 2025 research paper, “Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond),” brought the issue to light, exposing high levels of repetition not only within individual LLMs but also across different models. The researchers found that, when prompted with open-ended questions, different LLMs frequently converged on remarkably similar answers.
For instance, when 25 different LLMs were each asked 50 times to create a metaphor for “time,” the vast majority of the 1,250 responses were variations of “Time is a river” or “Time is a weaver.” Because LLMs favor high-probability, familiar responses, a wealth of less common ideas never surfaces. The exact reasons aren’t fully clear, but experts speculate that most LLMs are trained in similar ways on similar data, which leads them to produce similar outputs.
This lack of diversity isn’t confined to abstract metaphors. Ask mainstream models to name a car type, and you’re likely to get a Toyota or a Honda. Ask for band names, and words like “glass,” “neon,” “velvet,” or “static” turn up again and again. Even requests for a tagline for a product like New Balance running shoes often yield nearly identical, safe responses like “Run your way” from different prominent LLMs.
Introducing Flint: Breaking the Mold of AI Creativity
Enter Springboards, an Australian startup that has developed an LLM called Flint, engineered specifically to break this cycle of predictability. Unlike its mainstream counterparts, Flint is trained to generate a much wider spectrum of responses to open-ended queries. As Springboards co-founder and CEO Pip Bingemann puts it, “Most language models are fighting hallucinations. We welcome them.”
Flint’s unique approach is evident in tests. While ChatGPT and Claude might consistently offer “7” for a random number, Flint can surprise with a “3.7916.” When asked for a car type, it might suggest a Ford F-150 instead of the usual Toyota or Honda. For a New Balance tagline, Flint proposed “Built to last, run to win,” a distinct alternative to the common “Run your way.”
This willingness to diverge makes Flint a powerful tool for creative professionals. Zoe Scaman, founder of Bodacious, has been testing Flint and praises its ability to “catapult me all over the place.” She describes a scenario where mainstream models offered conventional solutions for reinventing a finance company for youth, while Flint suggested a truly disruptive idea: rebranding the entire concept of wealth accumulation.
Beyond Temperature: How Flint Achieves True Variety
Achieving this level of creative divergence wasn’t as simple as tweaking a setting. While most LLMs have a “temperature” parameter to adjust randomness, increasing it often leads to incoherent or nonsensical outputs. Springboards co-founder and CTO Kieran Browne noted that maxing out the temperature on some models could result in responses that unpredictably switch from English to code mid-sentence.
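To make the temperature trade-off concrete, here is a minimal sketch of how temperature reshapes a model’s next-token probabilities. The candidate answers and their scores are invented for illustration and do not come from any real model; the function name is likewise hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for replies to "pick a number from 1 to 10".
# "banana" stands in for an incoherent continuation.
candidates = ["7", "3", "8", "banana"]
logits = [4.0, 2.5, 2.0, -2.0]

for t in (0.5, 1.0, 2.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(t, {c: round(p, 3) for c, p in zip(candidates, probs)})
```

As the temperature rises, the favorite answer loses probability, but the nonsensical option gains it too, which is why turning the dial up across the board tends to trade predictability for incoherence.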
Springboards realized that these blanket parameters were too blunt an instrument. Instead, they developed Flint, built on Qwen 3, an open-source model from Alibaba, to apply randomness intelligently. Flint is specifically trained to identify precise points in its output where greater variety is desirable, only boosting the randomness at those strategic junctures rather than across the entire response.
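The idea of raising randomness only at chosen points can be sketched in a few lines. This is an illustration of the general concept, not Springboards’ method: the article does not say how Flint decides where variety is desirable, so the hand-set `open_ended` flag, the token lists, the scores, and both temperature values below are all assumptions.

```python
import math
import random

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of raw scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_with_targeted_temperature(steps, base_temp=0.3, boosted_temp=1.5, rng=None):
    """Sample one token per step, boosting temperature only where flagged."""
    rng = rng or random.Random()
    output = []
    for step in steps:
        # Assumption: each step carries a hand-set flag marking where
        # variety is welcome. A real system would have to learn this.
        temp = boosted_temp if step["open_ended"] else base_temp
        probs = softmax(step["logits"], temp)
        output.append(rng.choices(step["tokens"], weights=probs, k=1)[0])
    return " ".join(output)

# Hypothetical decoding steps for a metaphor about time.
steps = [
    {"tokens": ["Time", "Times"], "logits": [6.0, 0.0], "open_ended": False},
    {"tokens": ["is", "was"], "logits": [6.0, 0.0], "open_ended": False},
    {"tokens": ["a", "the"], "logits": [6.0, 0.0], "open_ended": False},
    {"tokens": ["river", "weaver", "tide", "lantern"],
     "logits": [3.0, 2.6, 1.5, 1.0], "open_ended": True},
]

rng = random.Random(0)  # fixed seed so repeated runs match
for _ in range(5):
    print(sample_with_targeted_temperature(steps, rng=rng))
```

In this toy setup the grammatical scaffolding stays stable because it is sampled at a low temperature, while the final noun, the only creative slot, is drawn from a flatter distribution.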
This targeted approach allows Flint to inject an “oddball” suggestion without sacrificing coherence. Maximilian Weigl, co-founder and chief strategy officer at Uncommon, a marketing firm, describes Flint as “more of an invitation to think wider.” His team uses Flint alongside other LLMs, recognizing that “you can’t really create something boundary-breaking with tools that pull you back to the average.”
The Future of AI: Choice and Creativity
While Flint offers a compelling alternative for stimulating creative thought, its developers acknowledge that it is still a prototype and may not always perform perfectly. Moreover, Weigl points out that for many routine tasks, the “average” output from mainstream LLMs is perfectly sufficient. The goal isn’t to replace reliable AI but to offer a choice for when true novelty is required.
Springboards’ work addresses a growing concern that AI could lead to a homogenized, “gray, boring world” of ideas. The challenge for users is to engage critically with AI output, regardless of its source. As Weigl cautions, “If I saw people on my team copy-pasting something from AI, I’d be like, ‘That’s not your job! Think, talk to other people, use your own voice.’”
Ultimately, Springboards aims to give users the option to seek diverse and unexpected ideas, particularly in fields like advertising and marketing. With tools like Flint, the company hopes to encourage exploration and keep machines from dictating the limits of our collective imagination.
Source: MIT Tech Review – AI