SoundHound AI, Inc. (SOUN) Earnings Call Transcript & Summary
December 4, 2024
Earnings Call Speaker Segments
Unknown Analyst
All right. Thanks for joining us on day 2 of the UBS Tech and AI Conference. Honored to have Nitesh with us today. Welcome, Nitesh. Nitesh is the CFO of SoundHound, and we're going to jump right into it. So maybe for some of the investors that don't know the story, would you be able to just help us with a bit of an overview, a bit of a background about SoundHound?
Nitesh Sharan
Absolutely. Yes. Thanks for having me. SoundHound AI, we're a conversational and voice AI company. We've been in business for over 20 years now, been pioneers and really driven a lot of technological differentiation. We have a vision to voice enable the world with conversational intelligence. We believe if you think of how humans interact, it's primarily through conversation. That's how we get things done efficiently. Yet technology for the last 75 years has really been about keyboard input, touch, type, swipe on your smartphone. And we believe the next major inflection in how humans interact with technology will be increasingly through natural conversations and in particular, voice. So we've built our own proprietary software stack. We're a software business. We have 3 pillars to our business model. We voice-enable products, and think of automotive, cars, TVs, IoT devices. And that has historically been the biggest part of our business. Up until last year, it was almost 90% of our revenue stream, royalty income from, again, as cars get voice enabled. We're seeing a lot of growth in what we call pillar 2, which is voice-enabled services. So one of the major applications is restaurants and food ordering. That is -- think of a drive-thru application where instead of having a conversation with a human to order your cheeseburger, French fries and Coke, you can now talk to automation. And we actually have many locations that are live with major QSRs where performance is exceeding human capability and human accuracy. But the opportunity in voice-enabled services is much greater. We're doing now things across industries in health care, retail, insurance, financial services. And that has been now a major inflection of our growth, and that's probably for the near-term horizon where we're most excited. But the long-term vision is also to bring the voice-enabled services pillar together with voice-enabled products in the third pillar, which we call monetization.
So if I just walk you through a couple of examples, you're driving into work and you have a voice-enabled Chrysler, you can just say, "Hey, Chrysler, I'd like coffee." Well, the car already knows where you're going and says, "Great, there's a Starbucks 1 exit over, a 5-minute detour, or we can order your coffee at Peet's. It's right next to your office. Which would you like?" You could place a transaction, and that's a new type of stream that we're trying to build. Again, our ultimate vision is that we think the world is going to be more and more voice enabled. LLMs, generative AI have catalyzed this, and we're seeing massive growth and expansion across a number of these industries. So that's what we've been building.
Unknown Analyst
Awesome. Thanks for the intro. Maybe just to dive right in, you have some of the largest software companies in the space, maybe something like a Salesforce, which is saying, don't DIY your AI and kind of come to us. Through our checks and the work we've done on the research side at UBS, we have seen that a lot of large customers more broadly are doing some form of DIY in their own AI. So I wanted to throw it over to you. How do you kind of feel about some of your customers or maybe some of your potential customers trying to DIY their AI? What have you sort of seen and uncovered and how can you help those customers?
Nitesh Sharan
Yes. I'll speak to it specifically around voice AI, but I think this extends into the broader Agentic AI revolution that's going on right now. And I think they both apply similarly. First of all, I mean, I think, the use cases we're talking about are tremendous. We're really talking about disruption of the $10 trillion services space, in which case, there's not going to be one uniform answer to that question. But reality is what we've seen historically, our Co-Founder and CEO, Keyvan Mohajer, his pioneering vision that built this company has a line that I've always appreciated, which is voice AI is hard, conversational AI is complicated. It takes 10 years to develop and it takes you 3 years to realize it takes 10 years to develop. So a lot of companies we've seen come and go. I think in the restaurant space, one public example is McDonald's, who said, I'm going to try to build in-house this voice automation capability. They tried to acquire some companies and they subsequently spun it off to IBM in a partnership. And then just about 6 months ago, they said, hey, that's not really working as well. So they're trying in fits and starts of things. Generally, I do believe you're going to need a partner who understands the space really well. There will be a partnership. We're scaling it, as I said, across automotive, restaurants, health care, financial services. And in different use cases, maybe there are aspects of this technology expansion that can be done do-it-yourself. I think we're trying to develop platform and capabilities to enable people to do more and more themselves. I don't think a lot of the applications that people are starting to dream about ("let me just access OpenAI's API, and I can get things done") are really going to be relevant or appropriate for especially enterprise-grade workflows because precision is really important. You can't accept 20%, 30% hallucination rates. And that's where really our technology comes to bear.
Our platform is a voice engine that has agnostic arbitration with our own content domain partners where we can get the correct precise answers and also where appropriate integration with LLMs. We are the first company who went live in the automotive space with our partner, Stellantis in Europe with an integration with ChatGPT. That was with the premium DS brand and they scaled into dozens of brands. We are doing similar things in the restaurant space. So again, I think the answer is going to be an evolution. I don't -- I think there will be areas where do-it-yourself could be appropriate. But by and large, for the market opportunity, I think, they're going to require specialist companies like ourselves to help catalyze that.
Unknown Analyst
Got it. And maybe to start off with the autos industry, you mentioned autos most recently. There's some competitors with pretty deep pockets here. You have Apple on the CarPlay side and Google on the Android Auto side. You also have some very large customers and big partnerships you've done. Maybe we could start there and kind of talk about where you see the competitive landscape and how you fit into the mix and some of the largest wins that you've been able to achieve.
Nitesh Sharan
Yes. I differentiate in the auto space kind of 2 different ecosystems of competitors. So there's the independent players of which we're one, and we've basically been disrupting the major incumbent for many years and stealing share effectively. And we're independent platform providers that have our own proprietary voice AI stack. And we have both technological advantages, but also business model advantages. And I'll talk more about that. And then the second piece are the big tech players like you're talking about, Android Auto, Amazon, Apple. And by the way, our platform integrates and partners with the big tech as well. So there's a couple of differentiating factors. Number one, like I said, we think there's differentiation on a technological standpoint. We have benchmarks. We've had customers do benchmarks on elements such as sentence accuracy. So in the automotive application, in a car setting at standstill and all the way in highway noise with a lot of background noise, how does our performance compare to big tech? And we see differentiation of several percentage points improvement at standstill, and then it goes double digits, 20 percentage point differentiation at 60, 70 miles per hour. So we believe there's technological differentiation. In the product suite, there's also differentiation against what the big tech does. Big tech generally provides cloud capabilities. They don't have edge-based excluding cloud. They are starting to offer customizable wake words, but one of the major strategic challenges for automotive that they're realizing, especially what we believe are the winning OEMs of the future, is that customer disintermediation and brand disintermediation is a real risk, and they want to own that customer journey. How you share data matters a lot. How you manage and customize privacy controls and security controls. Those are things that we, as an independent player, provide.
We work in perfect concert with the OEMs, and we do things in a brand centricity that oftentimes is elusive with big tech. And then like I said, we're really bringing new business model streams. So I mentioned at the beginning, the 3-pillar architecture, the monetization play that we have, where now you're in a Hyundai and you want to pick up that coffee, we're building that ecosystem. So we're now bringing together restaurant chains and other appointments and reservations into the car where we have stats that people are spending more than an hour a day in their car. And by the way, I came over here in a Waymo. And you can now access things much more fluidly with voice and conversations. And that's an ecosystem, a new business model. And by the way, when we brought generative AI into the vehicle, we saw not only multifold increases in usage. We've also seen that the engagement patterns, the longer duration of the queries, the questions that are being asked are just opening completely new threads. And so that's what we're all trying to bring here that, again, we -- that's why we're getting a lot of traction with the automotive.
Unknown Analyst
Got it. Maybe to flip over to another vertical you've had some pretty good success with, the restaurant vertical. Maybe you could talk a bit about restaurants, how that differs from autos and some of the success and some of the wins you've had on the restaurant side.
Nitesh Sharan
Yes. So this year, we've seen a lot of growth in the restaurant side. That's been sort of one of the biggest growth engines. And you can think of this as -- over the last few years, we came out of the pandemic and there was a major labor shortage challenge for many of the QSRs, and they needed just capability to serve the demand that they had. And by the way, one of the more permanent things that has happened post pandemic is the need for convenience, drive-thrus, pickup, takeout that structurally has changed in the restaurant industry. Then you always had sort of cost pressures, low-margin businesses, but also with inflation and so forth, cost was an important element. So the value proposition was quite clear for the last several years. And what we're seeing this year is a greater inflection on the ability to drive revenue incrementally. So the AI never hesitates to upsell. The AI never gets tired. It's not too busy packaging food. So what we found in several of our QSR brands is a 10%, 20% ticket price uplift. And so now the value proposition has shifted from just cost containment or cost mitigation to actually revenue growth. So we are seeing a lot of traction. This is both applications on the drive-thru side. We have -- one of our earlier partners that we've talked about was White Castle. Actually, locally here, both in Scottsdale and Tempe, there are live White Castles with our technology at play. So if you're around and you want to go try it out. And these are examples of stores that now have an AI assistant who can help with very busy times. You can calibrate it when maybe there's -- you need to slow down the flow of traffic because you need longer time for food preparation, so you can calibrate this. Our solutions are multimodal, meaning we have both visual elements as well. So now to upsell, you can just visually flash a picture of a sundae. You don't even need to ask, would you like some dessert with that.
And that also is driving some behavioral changes. And we're seeing great scale. This is one where there's definitely -- we're on the precipice of massive growth. There's a massive TAM. Just in the U.S. alone, we characterize this as a $1 billion-plus annual revenue opportunity for us. We're in dozens of languages globally because we started getting most of our commercialization in the automotive space. So it's a global opportunity that's multifold of just the U.S. one. But just in the drive-thru application alone, it's that big. And then we're also doing a lot of food phone ordering capabilities. One of the areas we're really excited about is pizza. So we now have 3 of, I think, the top 4 or 5 pizza franchises, QSRs as partners. And one of the biggest areas where people are still phoning in, whether it's Sunday during football or generally for dinner, it's calling in for pizza. So these are areas where we're getting subscription-based revenue, recurring revenue from a business model standpoint, that's attractive for us also, shifting from royalty-based streams to subscription streams. And as I said, it's the opportunity to also intersect the product side with the restaurant side. And as I mentioned, even more other opportunities. So there's a lot that we're moving on.
Unknown Analyst
One example that stuck with me, we were chatting before you got on stage. It was one of your customers in the Midwest, and they were utilizing their AI assistant to kind of help in the front of house while the employees were able to work in the back of house and prepare the food. Maybe we could elaborate on that, and it comes from the genesis of a lot of questions, a lot of investor questions we get about AI replacing human jobs or augmenting and AI upselling versus kind of taking over the entire spend of a software provider. So I'd kind of love to flip the question to you about how you see, again, you talked about upsell and cross-sell versus revenue generation. But maybe kind of touch on that example and how your technology was used and able to help this customer of yours really get through a busy period when they had an inflow of customers.
Nitesh Sharan
Yes. We had a great example. Last year when the -- I think it's called The Eras Tour, Taylor Swift was going through, it was actually in St. Louis, where we had just recently opened a new location with our voice capabilities. And it was -- I think it was a Friday night that throngs of Swifties came out and just kind of overwhelmed the store, where they only had 2 or 3 employees that were there. And the subsequent feedback we got from those employees was they call their AI system Julia, that Julia was just a true partner that helped us with all the process flow to get -- to just handle all the orders so that they could focus on food preparation and the customer activity. So absolutely, in that case, it was a real supportive element on the order taking process, and it absolutely felt like a supplementary employee to the folks in the store. But I do want to add that the solutions that we're offering are not only the order taking. We're actually doing a lot of things on employee productivity basis. So sometimes there are major QSRs who are hesitating to put the AI engine at the front end in the customer-facing side, but they absolutely are -- especially with unionization becoming greater and greater pressure for many QSRs, they're looking for support and employee productivity. So we can now very seamlessly ingest, for example, operating manuals or employee handbooks into an LLM, enable employees to access it. And if you needed to quickly learn how to make a specific beverage or how do you clean the specific machinery or you need to go in the back and pick up -- one of the challenges a lot of these retail locations have is going into the back for inventory restocking and getting -- picking up the silverware or replacement milk or whatever it might be, you can actually drive efficiency there through our AI system.
So our solutions just in the restaurant space alone, and it's much broader than that are not only the order taking, it's actually a lot of employee productivity and many, many locations and restaurants are willing to go there first before they're willing to engage the AI with the customer side.
Unknown Analyst
Right. That makes sense. You recently acquired a company called Amelia. Maybe you could help us understand what was so special about this acquisition? Was it the people? Was it the technology? And how does this help you propel the journey forward?
Nitesh Sharan
Yes. So I want to go back to sort of what we're about. We're really about voice enabling the world through conversational intelligence, and we think there's a massive TAM across a number of different ecosystems and industries. So for us, we characterize our entry into restaurants as sort of stage 1 of a long-term journey of expanding across industries, kind of akin to how Amazon entered into books, but that wasn't their long-term vision, right? They want to do e-commerce for all. So for us, that's how we thought of restaurants. We have a lot of opportunity within restaurants, but really, we built over 15-plus years this voice AI engine that can apply equivalently into other spaces. So I'd say, number one, what was interesting with Amelia is they had already deep relationships and customers in health care, financial services, insurance, enterprise-grade technology and partnerships that for us, frankly, to enter into the door, build solutions for it would have been several years, but there was an acceleration of the customer journey there. They do have really interesting technology, really top Magic Quadrant leader type technology on the conversational AI. It was very complementary to us. So there were really 2 elements to the investment thesis. One was around customer traction and opportunity with the customer. And then number two is the state of the business. And on the first one, we had many conversations where many large money center banks, for example, really saw voice as the next horizon that they were trying to get into. And Amelia had to go to third parties for what we knew we could bring in and provide better performance at a better cost and accelerate the journey to help scale with these customers.
So as one example, in the health care domain now, we can, not only provide a consumer journey on food ordering and for an automotive customer, they can check their results on a scan, they can get appointments, they can reorder their contact lenses. Like these are things now that we have been able to scale quickly into our voice engine that Amelia brought and accelerated our pathway. So one of many reasons, they have a great team. There's a lot of diversification. They have a strong group in India that we can leverage. So for multiple reasons, it was a great asset and...
Unknown Analyst
I'd love to go to the trust curve. It's something that everyone in this room, everyone listening in the webcast reads about online, whether it's in the contact center and maybe an airline having a model hallucination. Whether it's just maybe some of the consumer-based models having a hallucination and memes getting posted online. Customers see this too, and they're trying to make their decision of whether they go with an AI solution maybe or wait a little bit longer. Can you talk a little bit about your conversations with customers and where we are in this trust curve and how you approach the sales process and the go-to-market process and just educating customers and kind of showing them what's possible with your technology?
Nitesh Sharan
Yes. We think of it this way. For major inflections to happen, you need 2 things to come together. I think this will address the trust curve question, is the technology ready and is the customer adoption there. And we absolutely believe we found that intersection. And if I take the voice engines of the past, like if you think of your traditional interactive voice response systems where you call into Comcast and you have to press 1 for this, press 2 for that, it takes you forever, everybody is screaming operator, operator, get me out of here. That is a place of frustration for many, many customers, and the technology is now ready to displace that. So I think there's a little bit of disillusion around the legacy technology that now the technology is ready, and we're seeing a lot of momentum around customers ready to move to the new generation on that. Then there's actually new pathways to your question on like trust for -- even just within the last 2 years, we've seen massive shifts of people kind of going from a preference for humans and saying, if you get the automation, you have an immediate reaction like, let me talk to a human. That has absolutely shifted to preference in some cases of like, okay, I get -- somebody picks up the phone right away. They actually can handle the multitude of ways that I'm actually asking questions. There's not a lot of like in the maybe 10 years ago, generation of voice assistant that even today, not to be too disparaging, but the Alexa in your kitchen still is sort of limited in utility. You're now finding that, that utility base has expanded quite dramatically, and that's allowing humans to want to interact with technology at a greater pace. So it's not only a trust factor, but it's also just I'm willing to engage. I mentioned that in the auto side that we're seeing queries extend much longer. The types of queries that people are going after are much more varied.
And then I think the biggest point on your trust factor is that our differentiation isn't simply that we go and try to intersect with an OpenAI API because then you do have the risk of hallucination and sort of made up facts. We actually know in specific -- if you go into a drive-thru and you order a cheeseburger, French fries and Coke, you are pretty precise on what you want to come out of that, and that's why we measure order accuracy. And as I -- we saw post pandemic that there were some actual surveys out there that suggested that humans themselves were only 85% accurate. You sometimes order things and you wouldn't get exactly, you get onion rings instead of French fries. But we've now been live in certain locations where out of the gate, week 1, we're at 85%. And after we work with them for months, we're at 90%, 95%. So in terms of the precision and the trust, we're actually finding the AI is more accurate in some cases than humans. And I think that's causing inflection and even greater sort of trust around the platforms.
Unknown Analyst
Got it. I did want to mention there's QR codes on your desk. So if you do want to ask a question, feel free to scan it and we can ask it up here. Nitesh, we're at the AI conference. We talked a lot about AI, but your job as the CFO is also quite important, and we do have investors in front of us. On the cost side of the equation, when we adjust for acquisition, your gross margin is about 60%. And then as we move down to the operating margin, you're still working towards profitability. So I would love to get a little bit of an update about how you think about balancing growth and the opportunity ahead of you versus costs and profitability and gross margin going up over time?
Nitesh Sharan
Yes. We've been a disruptive technology company growing for many years, and a lot of our footprint of investment historically has been on R&D, and that's how we've built our proprietary engine, differentiated competitive moat in voice AI. We have now tried to commercialize. So we've been investing in go-to-market. We've been doing that both direct and indirect. But we have been in investment mode historically. And I have made now -- in our last earnings call, I made an update to our outlook that we would be adjusted EBITDA positive at the end of next year. So we are moving down that path, and we are very cost cognizant to make sure that we're being very edited with our investments. There's a ton of opportunity. Like I said, our ultimate vision is we think many new pathways are going to be voice enabled. They're going to be -- we're going to completely shift how we interact with technology, but we need to be edited because we have limited resources. So we're absolutely looking at all -- every element of our cost structure. We are a smaller, earlier-stage public company. So just with that comes some infrastructure costs, SOX compliance in my world, right, the legal, the regulatory compliance that scales very well with new commercial streams as we move from industry to industry, I don't have to double the team of my SEC reporting group. So we know that leverage will be there. As we get to that scale, we guided also revenue targets of $155 million to $175 million. And so at that level and it's continuing to grow, we've been historically growing over 50% CAGR. Last quarter, we grew almost 90% year-over-year. So as we continue to drive that distinction, what I've characterized is that we will get towards breakeven zone. At scale, many -- I think several years down the road, we do believe as a software company, we should be returning 30-plus percent EBIT margin businesses.
But I would expect for the near and medium term that we're going to be more in that breakeven zone where, yes, we're not in the heavy loss-making zone. We're moving to breakeven. But I don't think we will necessarily be in a massive rush to get to that 30% EBIT margin because the incremental opportunity is so tremendous. And we want to continue to invest in those growth opportunities across new vectors, across new multimodal channels and go deeper into different languages and so forth.
Unknown Analyst
We're starting to hear from some of your software peers about an uptick in the macro and even some green shoots across some of the recent October quarter end earnings prints. Are you seeing that as well? Are you seeing budgets loosen up? You're obviously in a state of higher growth and untapped TAM. But are you starting to see budgets loosen up a little bit? Or is it still kind of tough out there?
Nitesh Sharan
No, I think it depends on the vertical. So to be fair, in our automotive space, I think there's some still continuous sort of movement. And I think there's opportunity, and we're the disruptor that's penetrating and we're providing differentiation, but some of the customers are going through their own dynamics. I think if you go into the health care space, the answer is different than in the financial services space, is different in the restaurant space. Restaurants are still pressured with cost containment, and they are looking for ways to service this new mega trend around convenience and footprints to pick up and take home and all that kind of thing. So I do believe uniformly, people are really curious about this AI space and the conversational AI space and the Agentic AI space. So we are getting a ton of conversations. I mentioned publicly previously that in the restaurant space, if I were to compare and contrast now versus just even a year ago, we were going out trying to acquire more and more customers. And now we're having the silver lining challenge, but something we don't take for granted that there's a lot more demand, and we're just trying to have to calibrate that. We don't want to go to a major QSR and say, can you wait for a year? Like that's not how it works. So I do think there's been, I'll just call it post-election, a lot of optimism around the regulatory environment and opportunities to invest and we are starting to see that. Still pretty early. But I mean, we're very bullish. We're aggressively going after the market opportunity. We do think there are major shifts and opportunities in how consumer journeys are going to happen, and we're driving a lot of it. So I mean, we're net-net really bullish on the opportunity in front of us.
Unknown Analyst
Got it. Super helpful. On the cost of compute and the status of being able to acquire GPUs, it's something that maybe has died down a little bit, but I wanted to ask you, how are you seeing the cost of compute coming down impacting your business, your customers? And are you able to get your hands on GPUs, enough GPUs? Do you need the latest and greatest from NVIDIA? Or is that not necessary for you right now?
Nitesh Sharan
We have what we need, and we work with Oracle, our sort of primary cloud provider that we get access through -- there's 2 parts to the answer. There's the training part and then there's the inference part. The inference part, the costs are coming down, and that's a tailwind for us. And for a lot of the use cases that we're going after, the vertical applied AI space, you don't need to build the next trillion parameter model, and we're not even competing in that space with OpenAI and others, right? We partner with them. Our differentiation is that we can sit above and orchestrate and arbitrate with where an LLM makes sense, and we partner with OpenAI. We partner with Perplexity. We integrate and work with Llama's models, Meta's models. So we can leverage the development that's happening. We certainly have been through our own journey of accessing GPUs and so forth. And I think we have sufficient capacity to drive what we're doing. We are innovating on a foundation model we've talked about publicly, Polaris, which is a multimodal, multilingual architecture on the speech recognition engine that we're driving differentiation, and we're continuing to innovate, and it will always be a core to our thesis. But as I said on the inference side, we are seeing cost benefits also as we're starting to deploy these real-time use cases in enterprises in live customer environments. So net-net, our gross margin, because you mentioned the 60% gross margin, like I've said, we've historically been and we believe we're going to get to 70-plus percent gross margin. We've acquired some companies that have certain capabilities, call center capabilities in particular, that are more human-centric. That are great in terms of data access, allows us to convert these enterprise-grade data into better training for our models to build better algorithms to ultimately serve new use cases. That's the journey we're on.
So as long as that's still part of the thesis, we're willing to take a temporary sort of, call it, lowering of the gross margin to build the top line and the positive gross margin accretion over time.
Unknown Analyst
Right. I think we have time for one more. What do you want to leave investors with as we close down today? What's some of the most -- maybe the most misunderstood part of the SoundHound story?
Nitesh Sharan
Well, I think we're a new growing disruptor. So I think we're still getting to know many folks. And I think as you get to know us, hopefully, you'll see that we've been a public company now for just a couple of years. We've been delivering on what we said we would deliver on. There is a massive opportunity. I think what's really underestimated currently is the long-term opportunity of what this AI revolution is really driving. And if you unpack what it really means, what it's finally coming out of the transformer architecture of 2017 in these large language models, it is natural language conversations. We believe voice AI is the killer app. So our heritage and our distinction on the technological side of working in voice AI, we think, is a differentiator. We think it gives us a running start. We don't take it for granted. It's a very attractive market. There's going to be newer and newer players coming in. But we're also playing it aggressively. I think hopefully, people have seen this year that not only we delivered, but we've shown we can acquire, ingest and scale with acquisitions. We're doing -- being thoughtful and prudent on the capital side as well as towards your question earlier on getting to profitability. And ultimately, like we're just at the precipice of what we believe is going to be massive transformation in every new horizon and generation of 15-year shifts that happen in technology, we believe we'll be one of the new major players in this next horizon of Gen AI. So I hope that's what people think about when they think of us.
Unknown Analyst
Got it. Amazing. Thanks for tuning in to day 2 of the UBS Tech and AI Conference. I think we'll wrap it there.
Nitesh Sharan
Thank you.