The convergence of voice technology and computer vision is ushering in a new era of multimodal AI search, transforming how travellers discover, plan and interact with destinations through more intuitive and emotionally resonant interfaces. As platforms shift towards image and voice-led engagement, destinations that embrace tools such as AI avatars and visual search stand to deepen user connection, boost discoverability and future-proof their digital presence.
Travel inspiration has long been driven by visual aesthetics. The next evolution of AI search, driven by advancements in natural language processing and the convergence of voice technology and computer vision, will engage all our senses, transforming how we find and process information. Voice and visual AI search is rapidly gaining momentum, with major platforms, such as Microsoft Copilot Vision, now incorporating multimodal capabilities that allow users to search using natural speech and images rather than typed queries.
As adoption accelerates, ensuring AI systems can interpret and respond to these more human-centric input methods will bring competitive advantages in user engagement and yield valuable multi-dimensional data that text alone could never provide. Allowing users to engage in the most natural way, through voice and image, represents a new frontier in digital brand engagement. Yet, while our panel of destination experts agrees this new mechanism of search will be a long-term trend, it was fairly evenly split on the level of disruption to expect, between continued reliance on traditional chat-based search and this more immersive search process.
The future of travel booking now involves a far more intuitive experience. The doubling of average query length following the launch of Google's AI Mode signals a broad shift away from keyword searches to complex, conversational interactions as users increasingly expect AI to act as intelligent assistants that understand intent and support exploratory tasks. With the global multimodal AI market expected to grow at an annual rate of 35.8% until 2030, it is abundantly clear that searches will continue to become more seamless. This focus on natural and engaging multimodal AI user experiences will deliver more contextually rich results, which will be key to the continued improvement of accessibility in digital interactions.
Destinations and travel brands that offer AI-enabled travel planning or customer support tools must continue to invest in updating their established interfaces to ensure they remain aligned with changing visitor expectations and search behaviour. Landfolk's Dream Search 'Daisy' is just one example of how multimodal mechanisms of AI search can match text-based cues against a visual semantic model, helping travellers select the holiday home that best matches their travel preferences.
Similarly, KLM's 'Ask Atlas' uses empowering visuals as the cue for sparking travel interest. Using imagery as the basis for understanding travel preferences enables KLM to recommend the top three destinations in their flight network that are likely to resonate with travellers, supported by detailed and personalised AI-generated itineraries. Nevertheless, while the method of input and output of information is changing, the content itself remains paramount.
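To make the idea of matching a traveller's cue against a visual semantic model more concrete, the sketch below ranks a few listings by cosine similarity between embedding vectors. It is only an illustration under stated assumptions: the three-dimensional vectors, the listing names and the `rank_listings` helper are hypothetical, and nothing here describes how Landfolk or KLM actually implement their tools. In practice, the vectors would come from a trained text and image embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def rank_listings(query_vector, listing_vectors, top_k=3):
    """Score every listing against the query and return the best top_k."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in listing_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical image embeddings; the axes loosely mean (coast, forest, city).
LISTINGS = {
    "seaside_cabin": [0.9, 0.1, 0.0],
    "forest_lodge": [0.1, 0.9, 0.0],
    "city_loft": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a text cue such as "quiet cottage by the sea".
query = [0.8, 0.2, 0.0]

ranked = rank_listings(query, LISTINGS, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because the text cue and the images are assumed to share one embedding space, the same comparison works whether the traveller types a phrase, speaks it or uploads a photo.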
The shift to a more intuitive multimodal AI search process extends beyond searching websites for information. User behaviour is simultaneously gravitating towards visually rich video formats on social channels, leading to growing search volumes on these platforms. The natural evolution of this trend is the combination of voice search with imagery and video. This fusion offers a quicker and more engaging way for users to find and consume information. For content creators and marketers, the key challenge lies in ensuring that their content, particularly imagery and video, is appropriately tagged to surface effectively in response to voice search queries.
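One possible way to approach the tagging challenge described above is to attach structured metadata to each image. The sketch below builds a schema.org `ImageObject` description as JSON-LD. The helper name `image_metadata`, the example URL and the sample values are hypothetical, and whether a given voice or visual search product actually reads this markup is an assumption rather than something the sources here confirm.

```python
import json


def image_metadata(url, name, description, keywords, place_name):
    """Build a schema.org ImageObject description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "contentUrl": url,
        "name": name,
        # A natural-language description, phrased the way a spoken query might be.
        "description": description,
        "keywords": ", ".join(keywords),
        "contentLocation": {"@type": "Place", "name": place_name},
    }


# Hypothetical example values for illustration only.
doc = image_metadata(
    url="https://example.com/images/dune-sunset.jpg",
    name="Sunset over the dunes",
    description="A family walking along a quiet sandy beach at sunset",
    keywords=["beach", "sunset", "family holiday"],
    place_name="Example Coast",
)

# Serialise for embedding in a <script type="application/ld+json"> element.
json_ld = json.dumps(doc, indent=2)
print(json_ld)
```

Writing the description in conversational language, rather than as a list of keywords alone, is a design choice intended to mirror how people phrase spoken queries.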
With smartphones now integrating innovative AI-enabled features such as Circle to Search and Google Lens, destination marketers must also understand how to adapt to this shifting search pattern preferred by Gen Z travellers. With more than 20 billion monthly visual queries on Google Lens, of which 20% are related to purchasing journeys, and over 200 million Android devices supporting Circle to Search, such features can no longer be ignored. At the same time, devices like the Samsung Galaxy S25 series now offer visual AI capabilities through Gemini Live, allowing users to ask questions about what they see on their screen or via camera access in real time, making devices feel more alive.
In giving AI assistants "eyes", the introduction of visual AI into everyday interactions further enhances user experience and is starting to build a closer connection between people and generative AI interfaces. AI chatbots are becoming sources of trust and companionship, with approximately 10% of the 40 million weekly user interactions lasting half an hour as users engage deeply with them. As virtual companions, AI chatbots are beginning to affect the social fabric. Younger audiences predominantly see chatbots as friends that provide connections which feel safe, nonjudgmental and always available. This is a trend that brands need to consider carefully, as travellers may engage more naturally and on a deeper emotional level with these more interactive and human-centric AI interfaces, highlighting the importance of tone, boundaries and emotional intelligence in AI interactions.
At first glance, the above findings present a compelling reason for DMOs and leading travel brands to invest in MetaHuman technology and build AI avatars. From TUI's 'Lena' to Qatar Airways' 'Sama' and the German National Tourist Board's 'Emma', these AI-powered avatars are poised to play a significant role in assisting travellers. It's been reported that the use of AI avatars boosts visitor satisfaction by up to 30%. In engaging travellers in a more personable manner, AI avatars also play a strong role in promoting less-visited destinations through lifelike representation and compelling storytelling of fulfilling travel experiences in ways that promotional articles often can't convey. Yet authenticity concerns undermine some of the potential that virtual influencers bring to the future of destination marketing.
On the other hand, the ability to create virtual personas based on cultural backgrounds and other specific attributes enables destinations to represent themselves digitally and interact in a manner that mirrors real-life conversations. This capability can significantly enhance the quality of human interaction and make these virtual travel planning tools more welcoming. In going beyond pure promotion and offering clear and accurate information through voice, text and image, MetaHuman agents, such as VisitVesterhavet's 'Ida', demonstrate how traditional chatbots can evolve into platforms that better support the digestibility of detailed trip planning content.
As a voice-enabled AI interface, a MetaHuman can add an extra layer of personality to conversational interactions with visitors, which could become a key differentiator in strengthening relationships with travellers. Offering rich and expressive communication capabilities, MetaHumans could also make it more important for destinations to develop their own sonic identity that matches local dialects and accents. This additional layer of audio engagement represents a step forward in creating a truly distinctive interface that boldly acts as a destination ambassador.
Sonic branding is the art of creating a sound presence that cuts through the sea of similar brands and trending audio on social media platforms. Just like a logo, a memorable sonic identity makes a brand instantly recognisable. Sound taps directly into our emotions and even the subconscious, allowing brands to build stronger emotional connections and foster trust.
For over 15 years, voice has been heralded as the next dominant interface between humans and machines, with voice technology having transitioned from a novelty to an essential part of daily life for many. 81% of Americans now use voice technology on a daily or weekly basis, with 68% using this new search mechanism more frequently. Nevertheless, the rise of AI voice assistants in smart home technology, with adoption rates climbing dramatically from 6.9% of American households in 2015 to 30% in 2024, may have fuelled the newfound interest in voice AI applications.
The use of these new voice AI interfaces might simply be mirroring and competing against traditional smart home technology usage, such as Amazon's Alexa and Apple's Siri. While brain-machine interfaces remain a vision of the future, voice and visuals (alongside texting) are expected to lead the way in travellers' digital interactions. This strong adoption, with 58% of consumers willing to try brands that use voice interfaces and 69% maintaining their use after a positive experience, underscores the urgency of rebuilding apps and websites with AI-powered voice and multimodal interfaces.