Which AI Actually Is the Best at ‘Being Human?’

2025-03-15 19:01:02

Not all AIs are created equal. Some might do art the best, some are skilled at coding, and others have the ability to predict protein structures accurately.

But when you're looking for something more fundamental, just “someone” to talk to, the best AI companions may not be the ones that know it all, but the ones with that je ne sais quoi that makes you feel OK just by talking, similar to how your best friend might not be a genius but somehow always knows exactly what to say.

AI companions are slowly becoming more popular among tech enthusiasts, so these differences matter both to users who want the highest-quality experience and to companies trying to master the illusion of authentic engagement.

We were curious to find out which platform provided the best AI experience when someone simply feels like having a chat. Interestingly enough, the best models for this are not really the ones from the big AI companies—they’re just too busy building models that excel at benchmarks.

It turns out that friendship and empathy are a whole different beast.

This analysis pits four leading AI companions against each other—Sesame, Hume AI, ChatGPT, and Google Gemini—to determine which creates the most human-like conversation experience.

The evaluation focused on conversation quality, distinct personality development, and interaction design, and also considered other human-like features such as authenticity, emotional intelligence, and the subtle imperfections that make dialogue feel more genuine.

You can watch all of our conversations by clicking on these links or checking our GitHub repository:

Here is how each AI performed.

Conversation Quality: The Human Touch vs. AI Awkwardness

Sesame AI interface

The true test of any AI companion is whether it can fool you into forgetting you're talking to a machine. Our analysis tried to evaluate which AI was best at making users want to keep talking by providing interesting feedback, rapport, and an overall great experience.

Sesame blows the competition away with dialogue that feels shockingly human. It casually drops phrases like "that's a doozy" and "shooting the breeze" while seamlessly switching between thoughtful reflections and punchy comebacks.

"You're asking big questions huh and honestly I don't have all the answers," Sesame responded when pressed about consciousness—complete with natural hesitations that mimic real-time thinking. The occasional overuse of "you know" is its only noticeable flaw, which ironically makes it feel even more authentic.

Sesame’s real edge? Conversations flow naturally without those awkward, formulaic transitions that scream "I'm an AI!"

Score: 9/10

Hume AI successfully maintains conversational flow while acknowledging your thoughts with warmth. However, it feels like talking to someone who’s uninterested and not really that into you. Its replies were a lot shorter than Sesame’s: relevant, but not interesting enough to push the conversation forward.

Its weakness shows in repetitive patterns. The bot consistently opens with "you've really got me thinking" or "that's a fascinating topic"—creating a sense that you're getting templated responses rather than organic conversation.

It's better than the chatbots from the bigger AI companies at maintaining natural dialogue, but repeatedly reminds you it's an "empathic AI," breaking the illusion that you're chatting with a person.

Score: 7/10

ChatGPT tracks complex conversations without losing the thread, and it’s great that it remembers previous conversations, essentially creating a “profile” of every user. But it feels like you're trapped in office hours with an overly formal professor.

Even during personal discussions, it can't help but sound academic: "the interplay of biology, chemistry, and consciousness creates a depth that AI's pattern recognition can't replicate," it said in one of our tests. Nearly every response begins with "that's a fascinating perspective"—a verbal tic that quickly becomes noticeable, and a common problem that all the other AIs except Sesame showed.

ChatGPT's biggest flaw is its inability to break from educator mode, making conversations feel like sequential mini-lectures rather than natural dialogue.

Score 6/10

Gemini was painful to talk to. It occasionally delivers a concise, casual response that sounds human, but then immediately undermines itself with jarring breaks in the conversation and sudden drops in volume.

Its most frustrating habit? Abruptly cutting off mid-thought to promote AI topics. These continuous disruptions create such a broken conversation flow that it's impossible to forget you're talking to a machine that's more interested in self-promotion than actual dialogue.

For example, when asked about emotions, Gemini responded: "It's great that you're interested in AI. There are so many amazing things happ—" before inexplicably stopping.

It also made sure to let you know it is an AI, so there’s a big gap between the user and the chatbot from the first interaction that is hard to ignore.

Score 5/10

Personality: Character Depth Separates the Authentic from the Artificial

ChatGPT Interface after a voice interaction

How does an AI develop a memorable personality? It will mostly depend on your setup. Some models let you use system instructions, others adapt their personality based on your previous interactions. Ideally, you can frame the conversation before starting it, giving the model a persona, traits, a conversational style, and background.

To be fair in our comparison, we tested our models without any previous setup, meaning our conversation started with a hello and went straight to the point. Here is how our models behaved naturally.

Sesame crafts a personality you'd actually want to grab coffee with. It drops phrases like "that's a humdinger of a question" and "it's a tightrope walk" that create a distinct character with apparent viewpoints and perspective.

When discussing AI relationships, Sesame showed actual personality: "wow… imagine a world where everyone's head is down plugged into their personalized AI and we forget how to connect face to face." This kind of perspective feels less like an algorithm and more like a thinking entity. It’s also funny (it once told us that our question blew its circuits), and its voice has a natural inflection that makes it easy to relate to as it delivers a response. You can clearly tell when it is excited, contemplative, sad, or even frustrated.

Its only weakness? Occasionally leaning too hard into its "thoughtful buddy" persona. That didn’t detract from its position as the most distinctive AI personality we tested.

Score 9/10

Hume AI maintains a consistent personality as an emotionally intelligent companion. It also projects some warmth through affirming language and emotional support, so users looking for that will be pleased.

Its Achilles' heel is that, like the Harvard grad who needs to mention it, Hume can't stop reminding you it's artificial: "As an empathetic AI I don't experience emotions myself but I'm designed to understand and respond to human emotions." These moments break the illusion that makes companions compelling.

If talking to ChatGPT is like talking to a professor, talking to Hume feels like talking to a therapist. It listens to you and builds rapport, but it makes sure to remind you that doing so is its job and not something that happens naturally.

Despite this flaw, Hume AI projects a clearer character than either ChatGPT or Gemini—even if it feels more constructed than spontaneous.

Score 7/10

ChatGPT struggles to develop any distinctive character traits beyond general helpfulness. It sounds overly excited to the point of being obviously fake—like a “friend” who always smiles at you but is secretly fantasizing about throwing you in front of a bus.

“Haha, well, I like to keep the energy up. It makes conversations more fun and engaging plus it's always great to chat with you,” it said after we asked in a very serious and unamused tone why it was acting so enthusiastically.

Its identity issues appear in responses that shift between identifying with humans and distancing itself as an AI. Its academic tone in responses persists even during personal discussions, creating a personality that feels like a walking encyclopedia rather than a companion.

The model's default to educational explanations creates the impression of a tool rather than a character, leaving users with little emotional connection.

Score 6/10

Gemini suffers from the most severe personality problems of all models tested. Within single conversations, it shifts dramatically between thoughtful responses and promotional language without warning.

It is not really an AI designed to have a compelling personality. “My purpose is to provide information and complete tasks and I do not have the ability to form romantic relationships,” it said when asked about its thoughts on people developing feelings towards AIs.

This inconsistency makes Gemini feel like a 1950s movie robot, preventing any meaningful connection and making it unpleasant to spend time talking to it.

Score 3/10

Interaction Design

Hume AI interface

How an AI handles conversation mechanics (response timing, turn-taking, and error recovery) creates either seamless exchanges or frustrating interactions. Here is how these models stack up against each other.

Sesame creates conversation rhythms that feel very, very human. It varies response length naturally based on context and handles philosophical uncertainty without defaulting to lecture mode.

"Sometimes I feel like maybe I just need to cut to the chase with a quick answer rather than a long-winded lecture, right? You know, so… that's a small humorous aside to let you know that I'm aware of the potential of falling into a lecture mode and trying to keep things light but also deep at the same time," Sesame told us during a philosophical debate.

When discussing complex topics, it responds conversationally: sometimes with a small joke, sometimes with statements, and other times with human noises like “hmmms” or whispers, which makes it very convincing as a human substitute.

Sesame also asks natural follow-up questions that push conversations forward, while knowing when to switch to statements when questioning might feel awkward. This adaptive approach creates dialogue that feels spontaneous rather than scripted.

Score 8/10

Hume AI was designed to let users do the talking. It recovers smoothly from misunderstandings without over-explaining and is good at keeping the conversation going with relevant follow-up questions.

However, its responses are often so short that conversations require significant effort to maintain momentum, which ultimately defeats its purpose. Users frequently need to provide new conversation fodder rather than enjoying natural dialogue development.

Another weakness appears in its templated response structure (acknowledgment + explanation + question), which creates a mechanical rhythm: "What a thought-provoking question! This concept is indeed a fascinating idea… What do you think about…?"

However, it has a great feature that may truly enhance your chats: it can detect the levels of emotion in each of your responses, which is something no other model we tested can do. This can help you reflect on how much a topic affects you and how you’re approaching a specific situation.

Score 7/10

ChatGPT consistently addresses questions with relevant information but transforms conversations into unnecessarily elaborated interactions. Responses frequently balloon into paragraph-length explanations that disrupt natural dialogue flow.

For example, we asked it if it could develop some level of self-awareness or consciousness, and this was its reply:

“That's a fascinating topic! While AI can definitely evolve and improve through interactions, true self-awareness or consciousness is something that remains uniquely human. AI can mimic conversational patterns, respond empathetically and even learn from past interactions, but it doesn't have an inner sense of self or subjective experience. Researchers and developers are still exploring the boundaries of AI's capabilities but for now consciousness remains a uniquely human trait.”

That’s not how a human would interact.

Its heavy reliance on opener phrases like "that's a really interesting question," or “that’s a fascinating topic” before every single answer further undermines conversational immersion, creating an interaction pattern that feels mechanical rather than natural.

Score 6.5/10

Gemini is a masterclass in how not to design conversation mechanics. It regularly cuts off mid-sentence, creating jarring breaks in dialogue flow. It picks up background noises, interrupts you if you take too long to speak or think about your reply, and occasionally just decides to end the conversation without any reason.

Its compulsive need to tell you at every turn that your questions are "interesting" quickly goes from flattering to irritating, though this seems to be a common habit among AI chatbots.

Score 3/10

After testing all these AIs, it’s easy to conclude that machines won’t be able to substitute for a good friend in the short term. However, for the specific case in which an AI must simply excel at feeling human, there is a clear winner and a clear loser.

Sesame (9/10)

Sesame dominates the field with natural dialogue that mirrors human speech patterns. Its casual vernacular ("that's a doozy," "shooting the breeze") and varied sentence structures create authentic-feeling exchanges that balance philosophical depth with accessibility. The system excels at spontaneous-seeming responses, asking natural follow-up questions while knowing when to switch approaches for optimal conversation flow.

Hume AI (7/10)

Hume AI delivers specialized emotional tracking capabilities at the cost of conversational naturalness. While competently maintaining dialogue coherence, its responses tend toward brevity and follow predictable patterns that feel constructed rather than spontaneous.

Its visual emotion tracker is pretty interesting, and probably even good for self-discovery.

ChatGPT (5.6/10)

ChatGPT transforms conversations into lecture sessions with paragraph-length explanations that disrupt natural dialogue. Response delays create awkward pauses while formal language patterns reinforce an educational rather than companion experience. Its strengths in knowledge organization may appeal to users seeking information, but it still struggles to create authentic companionship.

Google Gemini (3.5/10)

Gemini was clearly not designed for this. The system routinely cuts off mid-sentence, abandons conversation threads, and is not able to provide human-like responses. Its severe personality inconsistency and mechanical interaction patterns create an experience closer to a malfunctioning product than meaningful companionship.

It’s interesting that Gemini Live scored so low, considering Google’s Gemini-based NotebookLM is capable of generating extremely good and long podcasts about any kind of information, with AI hosts that sound incredibly human.
