Microsoft launched a deepfake creator at its Ignite 2023 event

One of the more unexpected products to launch out of the Microsoft Ignite 2023 event is a tool that can create a photorealistic avatar of a person and animate that avatar saying things the person never actually said.

Called Azure AI Speech text to speech avatar, the new feature, available in public preview today, lets users create videos of an avatar speaking by uploading images of a person they want the avatar to resemble and writing a script. Microsoft’s tool trains a model to drive the animation, while a separate text-to-speech model — either prebuilt or trained on a human voice — “reads” the script aloud.

“With text to speech avatars, users can more effectively create video …” Microsoft wrote in a blog post. “You can use an avatar to build conversational agents, virtual assistants, chatbots and more.”

Avatars can speak many languages. And, for chatbot scenarios, they can tap AI models like OpenAI’s GPT-3.5 to answer off-script questions from customers.

Now, there are countless ways such a tool could be abused — which Microsoft, to its credit, recognizes. (Similar avatar-generating tech from AI startup Synthesia has been misused to create propaganda in Venezuela and false news reports promoted by pro-China social media accounts.) Most Azure subscribers will only be able to access prebuilt — not custom — avatars at launch; custom avatars are currently a “limited access” capability available through registration only and “only for certain use cases,” Microsoft said.

But the feature raises many uncomfortable ethical questions.

One of the main sticking points in the recent SAG-AFTRA strike was the use of AI to create digital likenesses. The studios eventually agreed to pay actors for their AI-generated likenesses. But what about Microsoft and its customers?

I asked Microsoft about its position on companies that use actors’ likenesses without, in the actors’ views, proper compensation or even notification. The company did not respond — nor did it say whether companies should label avatars as AI-generated, as YouTube and a growing number of other platforms require.

Personal voice

Microsoft appears to have more guardrails around a related generative AI tool, personal voice, which was also launched at Ignite.

Personal voice, a new capability within Microsoft’s custom neural voice service, can replicate a user’s voice within seconds given a one-minute speech sample as an audio prompt. Microsoft pitches it as a way to create personalized voice assistants, dub content into different languages and generate custom narrations for stories, audiobooks and podcasts.

To avoid potential legal headaches, Microsoft requires that users provide “express consent” in the form of a recorded statement before a customer can use personal voice to synthesize their voices. Access to the feature is gated behind a registration form for now, and customers must agree to use personal voice only in applications “where the voice does not read user-generated or open-source content.”

“Use of the voice model should remain within an application and the output should not be published or shared from within the application,” Microsoft wrote in a blog post. “[C]ustomers who meet limited access eligibility criteria maintain sole control over the creation, access and use of the voice models and their output (where applicable) for dubbing in films, TV, video and audio for entertainment scenarios only.”

Microsoft did not respond to TechCrunch’s questions about how it will compensate actors for their personal voice contributions — or whether it plans to implement any kind of watermarking tech so that AI-generated voices can be easily identified.

This story was originally published at 8am PT on Nov. 15 and updated at 3:30pm PT.