It's been a while since I've posted anything here. I've been rather busy with personal and work matters, and just haven't had a lot of time for development recently. But I won't go into that.
Recently, I've been revisiting my work on getting a decent lip-sync animation solution working in Unity. After a LOT of practice learning how Unity coding works, I feel much more comfortable working with Unity's standard workflow instead of trying to fight it.
I should have a nice preview up in the next week or so. The sprite-based version of my little script already has a functional prototype, and I just found out that I'll be able to develop a 3D-based BlendShape version of the script as well. Everything is coming along smoothly.
I'm posting here to possibly gauge interest in a script like this for other developers. I know that it has far more application for cut-scenes and machinima, and is looked down upon somewhat by gameplay purists. But I've always been a bit of an animation enthusiast, and am hoping to get this project up on the Unity Store eventually. My initial plan is to put the basic version up for free, and maybe drum up some publicity, and then monetize it if I expand into a "Pro" version down the line.
Replies
Right now I'm figuring out how to rig up the script to allow the user to select sub-objects instead of just the root object for the automatic animation creation. This would make the system more flexible, and would allow more complex facial animations to all be integrated into the same animation file.
Sadly, no. My internet at home has been out for a few days. I haven't had a chance to properly set up my Asset Store account. Also, I really need to do a little extra work on the website for the company I started. Like many designers, my passion for the project has been superseding the need for proper presentation. You know how it is.
I did post a little code example in the Animation section, but nothing in the Work-in-Progress section yet. Hopefully I'll have some free time tonight.
I've mainly been working at refactoring this project to adhere to the Unity Asset Store guidelines. One of their stipulations for custom scripts is that inspectors need to use the SerializedObject and SerializedProperty classes in order to tie into the standard undo/redo functionality of the Unity editor. This changes the way my code is structured, but I've been able to work with it.
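For anyone curious what that SerializedObject/SerializedProperty pattern looks like, here's a minimal sketch of a component/inspector pair. The component class and its field names are purely illustrative stand-ins, not the actual Cheshire code.

```csharp
using UnityEngine;
using UnityEditor;

// Hypothetical lip-sync component; field names are illustrative only.
public class LipSyncComponent : MonoBehaviour
{
    public Renderer mouthRenderer;
    public float frameRate = 24f;
}

[CustomEditor(typeof(LipSyncComponent))]
public class LipSyncComponentEditor : Editor
{
    SerializedProperty mouthRenderer;
    SerializedProperty frameRate;

    void OnEnable()
    {
        // FindProperty looks fields up by name on the serialized target.
        mouthRenderer = serializedObject.FindProperty("mouthRenderer");
        frameRate = serializedObject.FindProperty("frameRate");
    }

    public override void OnInspectorGUI()
    {
        // Routing edits through the SerializedObject is what ties the fields
        // into Unity's native undo/redo and multi-object editing support.
        serializedObject.Update();
        EditorGUILayout.PropertyField(mouthRenderer);
        EditorGUILayout.PropertyField(frameRate);
        serializedObject.ApplyModifiedProperties();
    }
}
```

The trade-off is that you describe fields by name instead of touching them directly, but you get undo, prefab overrides, and multi-object editing essentially for free.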
I've also been testing to ensure that fringe bugs won't show up for the end user. I've automated the detection of which kind of renderer is being used, and taken that into account in the GUI. I've also cobbled together a simple system for selecting a sub-object for the animation instead of the object the script is applied to. This will allow the script to support characters built from multiple objects using Unity's hierarchy. I can think of several instances where that might be useful.
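The renderer auto-detection could be sketched like this, assuming the mouth lives on either a SpriteRenderer (2D) or a SkinnedMeshRenderer with blendshapes (3D). This is my guess at the approach, not the actual Cheshire code.

```csharp
using UnityEngine;

// Sketch of renderer auto-detection for a lip-sync target object.
public static class MouthRendererUtil
{
    public enum MouthMode { None, Sprite, BlendShape }

    public static MouthMode Detect(GameObject target)
    {
        // 2D case: sprite-swapping mouth shapes.
        if (target.GetComponent<SpriteRenderer>() != null)
            return MouthMode.Sprite;

        // 3D case: a skinned mesh that actually has blendshapes to drive.
        var skinned = target.GetComponent<SkinnedMeshRenderer>();
        if (skinned != null && skinned.sharedMesh != null &&
            skinned.sharedMesh.blendShapeCount > 0)
            return MouthMode.BlendShape;

        return MouthMode.None;
    }
}
```

An editor GUI can then branch on the returned mode and only show the controls that make sense for that renderer type.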
I also threw together a quickie 3D model for testing the 3D mesh deformation. So far everything has been working swimmingly. There are a few more conditions I need to test for, but progress is steady.
I finished the 3D model for testing, and finished creating all of the blendshapes I would need for a proper test. Nothing to write home about in terms of visuals, but he has lips, teeth, and a tongue, and can move pretty well for the purposes of talking.
Yesterday I had my first successful test with the 3D model, and managed to get him talking with a pretty satisfying degree of accuracy. It actually looked better than I had expected. I've never created a talking model before. (never bothered tackling the Source import procedure)
Still quite a few bugs to squish, but I'm feeling really good about where this is going. I'll be working on documentation soon.
Unity Lip Sync Demo
It's just a little Unity Web Player example of the finished animations. But it gives you an idea of what this project is capable of. Each of these examples took just a few minutes to record, process, and have running in Unity. None of them have been altered or cleaned up either. (although you can clean them up once the animations have been saved)
Could this work in real time for, say, a voice chat system?
Unfortunately, no. The processing is very much focused on pre-scripting. There are solutions out there geared toward real-time processing; this just isn't one of them.
And with any real-time solution, you're going to have significant problems with lag. Processing the audio for phonetic shapes always takes some amount of time, and that discrepancy will always lead to lackluster lip-sync results. The only real compromise would be to delay the audio playback slightly in order to give the sound processor time to catch up.
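In Unity terms, that delay compromise might look something like the sketch below: schedule the clip a fixed lead time in the future so a (hypothetical) analyzer gets a head start. The analyzer itself isn't shown; the lead time value is just an example.

```csharp
using UnityEngine;

// Sketch of the "delay playback" compromise for real-time lip sync.
public class DelayedVoicePlayback : MonoBehaviour
{
    public AudioSource voice;
    public float analysisLeadTime = 0.2f; // seconds of headroom for processing

    public void Speak(AudioClip clip)
    {
        voice.clip = clip;
        // PlayScheduled uses the DSP clock, so the delayed start is
        // sample-accurate rather than frame-rate dependent.
        voice.PlayScheduled(AudioSettings.dspTime + analysisLeadTime);
    }
}
```

Even then, the lead time has to cover your worst-case processing hitch, which is why pre-scripted audio remains the more reliable approach.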
This solution is very much geared toward pre-scripted animated sequences. Cut-scenes and machinima from pre-recorded audio. It's also designed to produce Unity animation files. How the end-user decides to play back those animation files is going to be up to them.
Cheshire Unity plug-in
I still need to produce some more documentation for it. Specifically, I will need to throw up some more web pages, and produce some manner of demonstration video. The good news is there is a release version now that anyone can go and try out. Hopefully with the feedback I get I will be able to refine and improve the software.
I actually did take care of it, but the documentation I created went up on the website. I decided it would be too much trouble to create a full micro-site that could be bundled for distribution with the package. I have a day job and I'm releasing this thing for no money. On top of that, there are plenty of other projects I'm working on.
The Mad Wonder website has some documentation already, and there will be more to come. But waiting until I could produce a full documentation package would have pushed the release back by a half-month, minimum. I decided it was more valuable for people to get to play around with the software themselves.
Using Cheshire for Unity
It feels great to have someone using something I made and enjoying it. Apparently it is also working pretty well with non-English languages. I wouldn't expect it to work with EVERY language. But it should be decent for any language that uses most of the same basic phonemes.
A big thank you to Vul Gerstal. This really made my day!
I'm gonna have to give this a shot next time I need a lip-sync solution.
To be honest, I spent a considerable amount of that time out of work. Unemployment sucks. Thankfully, I have managed to secure a new full-time gig, and my after-hours productivity has been off the charts recently.
One of my big time-sinks for that after-hours productivity has been an update to Cheshire! With the 4.6 update, and now even the 5.0 update, it's high time that Cheshire got brought up to speed with the latest versions of Unity. I took this lip-sync plug-in back to the drawing board and essentially started from the ground up. I've been mainly focused on increasing the number of features, as well as breaking it up into logical components for ease of use.
I haven't reached a testable state for the animations yet, but the back-end has been coming along swimmingly. One of the nicer new features is that I'm bundling the SAPI lip-sync program with the plug-in and integrating it directly into the Unity interface. For Windows users, there will be no more need for the stand-alone application; they will be able to do their timing-ripping right from within Unity. I still have a little more work to do on this feature, but the initial experimentation and testing have been very promising. Running the program from Unity, retrieving the output, and even testing to identify the correct program all work like a charm. The rest is details.
The other big feature coming in the next update is support for a custom animation format that will be serialized in-scene. This is necessary for providing custom playback of animations that are truly "synced" to audio playback. The current version of Cheshire creates Unity-native animations that can be played back alongside audio, but aren't fully "synced." That true "syncing" is very important for performance-constrained platforms. When testing, I found there was a real issue with playback when deploying to the Unity Web Player, or to mobile devices like Android. (presumably iOS as well) These less robust platforms often have trouble decoding compressed audio quickly. There's often a slight delay when starting to play an audio file, and this can noticeably throw off lip-sync animation if the animation isn't designed to take its cue from the audio.
I've figured out how to create Animation Curves independently of animation files, and I'm confident that I can use them to generate and play back complicated, smooth animations based on the current time of an audio clip.
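The audio-driven playback idea could be sketched like this: instead of letting an animation free-run, sample a curve at the audio source's current playback time every frame. The curve name and blendshape index here are illustrative assumptions, not Cheshire's actual data layout.

```csharp
using UnityEngine;

// Sketch of audio-cued playback: the mouth follows the audio clock, so any
// startup delay in decoding the clip delays the animation by the same amount.
public class SyncedMouthPlayback : MonoBehaviour
{
    public AudioSource voice;
    public SkinnedMeshRenderer face;
    public AnimationCurve jawOpenCurve; // keyed in seconds, values 0-100
    public int jawOpenBlendShapeIndex = 0;

    void LateUpdate()
    {
        if (!voice.isPlaying) return;
        // voice.time tracks the decoded audio position, not wall-clock time.
        float weight = jawOpenCurve.Evaluate(voice.time);
        face.SetBlendShapeWeight(jawOpenBlendShapeIndex, weight);
    }
}
```

Because the curve is evaluated against AudioSource.time rather than a separate animation clock, the two can never drift apart.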
The class I wrote to handle retrieving data from the command-line program has worked out very well. It checks everything, ensures that it is all running properly, and then gets the string output. Works like a charm. That feature is pretty much in the bag.
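The core mechanism for that kind of class is plain .NET process launching with stdout redirection, which works fine from Unity editor scripts. Here's a minimal sketch; the executable path is a stand-in, not the actual SAPI tool's name.

```csharp
using System;
using System.Diagnostics;

// Minimal sketch of launching a command-line tool and capturing its stdout.
public static class CommandLineRunner
{
    public static string Run(string exePath, string arguments)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments)
        {
            UseShellExecute = false,       // required for stream redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read stdout before waiting so a full output buffer can't deadlock.
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    exePath + " exited with code " + process.ExitCode);
            return output;
        }
    }
}
```

A real version would also redirect stderr and verify the executable exists before launching, which is roughly the "checks everything" part described above.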
Right now I'm working on writing some parsing functions that will take TextAsset files, and return the timing data that is stored in them. While the command-line timing import feature is neat, it's only available on Windows. And I've always wanted Cheshire to be accessible to as many users as possible. So I want it to be able to parse data from some of the most commonly used lip-sync programs.
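For the Moho/Papagayo case, the parsing is simple enough to sketch. Papagayo exports Moho "switch" data: a "MohoSwitch1" header followed by "frame phoneme" lines. This is a guess at the general shape of such a parser (fed from a TextAsset's .text string), not the actual Cheshire code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// One phoneme event from a Moho switch / Papagayo export.
public struct PhonemeTiming
{
    public float Time;      // seconds from the start of the clip
    public string Phoneme;  // e.g. "AI", "O", "rest"
}

public static class MohoSwitchParser
{
    // Frame numbers are converted to seconds with a caller-supplied frame rate.
    public static List<PhonemeTiming> Parse(string text, float framesPerSecond)
    {
        var timings = new List<PhonemeTiming>();
        foreach (string rawLine in text.Split('\n'))
        {
            string line = rawLine.Trim();
            if (line.Length == 0 || line.StartsWith("MohoSwitch"))
                continue; // skip blanks and the header line

            string[] parts = line.Split(new[] { ' ', '\t' },
                                        StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length < 2) continue; // ignore malformed lines

            int frame = int.Parse(parts[0], CultureInfo.InvariantCulture);
            timings.Add(new PhonemeTiming
            {
                Time = frame / framesPerSecond,
                Phoneme = parts[1]
            });
        }
        return timings;
    }
}
```

Each supported program would get its own small parser like this, all returning the same timing structure so the rest of the pipeline doesn't care where the data came from.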
Right now, I'm looking at supporting the SAPI command-line program, Cheshire, Moho Switch files, and Papagayo. Does anyone know of other lip-sync software whose output they would like to see supported?