Issue #6: Dimensional Dispatch

Howdy readers!

First, to follow up on a few things we touched on in our last issue: in case you missed it, Unity actually came out and apologized for its new runtime fee policy, acknowledging the need for more community involvement. The Sphere also debuted its first live performances with U2, and concertgoers are sharing some of the most insane AV visuals I've personally ever seen. It does beg the question: first iTunes, now The Sphere? Can someone please explain the influence of U2 and why they always get a front row seat to every technological advancement since sliced bread? Genuinely curious :-) 

Six months after an open letter called for a pause in development of powerful AI language models, there have been some successes but no meaningful regulation or industry-wide pause. While the conversation around AI safety risks has been normalized, tech companies continue developing new systems without restraint. Especially in the last few weeks (as you'll read about below), it feels like a whole year's worth of groundbreaking research is announced on an almost daily basis. Slow your roll, people! 

One thing that caught my attention this week was Meta's launch of AI chatbots. If you haven't heard, celebrities like Kendall Jenner, Charli D'Amelio, Dwyane Wade, MrBeast, and Snoop Dogg have all partnered with Meta to create lifelike AI alter egos that you can chat with across Meta-owned apps like WhatsApp, Instagram, and Messenger. The news of these AI chatbots was first revealed during the company's Connect event last month. The weird thing is, Kendall Jenner's chatbot is not actually Kendall: her likeness is used for Billie, a "big sister" who is capable of giving advice. I just requested access yesterday, so I can't speak to how it actually feels to interact with them, but Jules Terpak gives a pretty great (and creepy) walkthrough of this new feature here.

ChatGPT can hear and speak to you

OpenAI has begun rolling out new voice and image capabilities in their ChatGPT conversational model. These additions let users have voice conversations with ChatGPT or show it images to discuss. The capabilities are being introduced gradually to provide more intuitive ways to interact with ChatGPT while prioritizing safety, since these features present new risks. From what we've heard (literally!), these new capabilities expand the ways users can interact with ChatGPT tenfold. The voices are a far cry from what we're used to with Siri and Alexa and may even give Uncle Rabbit a run for his money.

RealFill completes images with what should have been there

Credit: @natanielruizg

RealFill is a new approach for image completion that is able to fill in missing regions of an image with content that should have been there based on a few reference images of the same scene. It uses a generative model personalized to the reference images to complete target images, even when there are large differences in viewpoint, lighting, etc. between the references and target. The personalized model maintains an image prior while also learning the scene contents and style from the references to generate completions for targets that are both high quality and faithful to the original scene depicted in the references. Read the full paper here.

Gaussian Splatting for all

Polycam and Luma have both joined the Gaussian Splatting race in the past few weeks! Polycam came out with their announcement two weeks ago, and Luma followed very shortly after (last week) with the ability to re-render your NeRFs as Gaussian Splats using your old captures. I have yet to run side-by-sides myself (Polycam vs. Luma processing, or NeRFs vs. Gaussian Splats on my own captures), but I can at least point you to the right resources:

  • Jonathan Stephens processes a capture from his iPhone 14 Pro on Luma's new Gaussian Splatting feature here.
  • Bilawal Sidhu processes drone footage of Las Vegas with Luma's new feature here.
  • Marvin Rosario compares Luma with Polycam here.
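For the curious, here's a tiny sketch (my own toy code, not Polycam's or Luma's) of the idea at the heart of Gaussian Splatting's rendering step: each Gaussian contributes its color with some opacity, and contributions are alpha-blended front to back along each camera ray.

```python
def composite(splats):
    """Alpha-blend (color, alpha) pairs, sorted front to back along a ray.

    Implements C = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j),
    the standard front-to-back compositing formula.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light not yet absorbed by closer splats
    for c, a in splats:
        color += c * a * transmittance
        transmittance *= (1.0 - a)
    return color

# A half-opaque white splat in front of a fully opaque black one:
print(composite([(1.0, 0.5), (0.0, 1.0)]))  # 0.5
```

Real renderers do this per pixel over millions of anisotropic 3D Gaussians projected into screen space, but the blending math is this simple.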

Runway partners with Canva

Credit: Canva

Runway is one of the most exciting companies operating today with an AI-first approach to making creative tools. This exciting partnership will allow Canva's 150 million monthly users to access Runway's AI video generation model, Gen-2, directly in Canva through a new Magic Media app. Until now, Runway was used mostly by a small handful of creators experimenting with AI; this expands access to cutting-edge AI models and gives Canva users new ways to bring their visual ideas to life with video. More on that news here.

Somebody's Watching Me 🎶

Credit: Rewind

In what appears to be a painfully un-ironic episode of Black Mirror, several companies are racing to develop new forms of wearable AI devices that could challenge the smartphone's dominance: smart glasses from Meta x Ray-Ban, pendants from Rewind, and pins from Humane. While all different form factors, they all offer voice-based AI access without the need for a phone. Before you get too excited, there is real cause for concern here: many of these resemble everyday accessories like jewelry, raising obvious privacy concerns, since the devices often record audio without making it totally clear when. While proponents argue they offer more natural interactions than phones, critics warn they risk retaining too much private context from users' daily lives without sufficient transparency or consent.

While most of these companies have, on paper, promised a privacy-first approach for their wearables, it's hard to ignore the risks of always-listening devices: with few visible indicators, they could enable unprecedented surveillance if abused.

Spotify tests AI translation and voice cloning for podcasts

Last week, Spotify announced that it is piloting an AI system that can automatically translate selected podcasts into other languages like Spanish and German while retaining the original speaker's voice, using voice cloning technology. The program aims to bring more language diversity to podcasts, but it could introduce translation errors that are difficult for non-native listeners to detect. The feature is being piloted with some of the platform's most popular podcasts, including shows from Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.

It's important to note that this feature isn't welcomed by all podcasters. Some are rightfully worried about what it means for a future where voices can easily be faked and scam calls become that much more convincing. For now, the feature is limited to a few languages and requires opt-in from the podcaster.

The uncanny valley of Mark Zuckerberg and Lex Fridman

A few weeks back, Meta CEO Mark Zuckerberg demonstrated Meta's major advances in photorealistic avatar technology during a VR conversation with podcaster Lex Fridman. Through the use of detailed 3D facial scans and AI analysis of expression movements, the new avatars realistically translated subtle nuances of expression, even surprising the usually stoic Fridman with their human-like quality. Zuckerberg noted that the avatar creation process will soon be streamlined, using smartphone cameras and AI to extrapolate full facial representations from only a few minutes of scan data. As the race between the FAANG companies heats up, it appears that Meta is entering in full force ("insert pun here about Mark Zuckerberg's jiu-jitsu training").

What are we up to?

Blender tutorial + starter file

We published a new Blender walkthrough of our Alice Add-on last week, complete with a starter file so you can make holograms in no time! 

Uncle Rabbit gets a couple new friends

We also just released v1.4.0 of Liteforms. This release, our first major update, has been a long time coming. It introduces a feature many of you have been asking for: customizable Liteforms! Users can now pick from two new avatars and set their name, pronouns, personality, and voice (from 18 presets). Head on over here to give it a spin.

Other links

  • Custom interactivity powered by Blender is coming to Mozilla Hubs (here)
  • Assistant with Bard: A step toward a more personal assistant (here)
  • Anthropic in Talks to Raise $2 Billion From Google and Others Just Days After Amazon Investment (here)
  • Getty Images Joins the Generative AI Game (here)
  • Meta launches Quest 3 (here)
  • John Carmack remains "unconvinced" that MR applications are a driver for increasing headset sales (here)

That's all for this week, folks!

Hope you have a great one and to the future!

-Nikki