Abstract

Character animation often requires a delicate balance between dynamic motion and the preservation of local structures and temporal coherence. Achieving this balance in free-form character animation, where intricate motions and topological consistency must be maintained, poses significant challenges that traditional methods struggle to overcome. In this work, we introduce FlexiClip, a novel animation framework designed to generate smooth, natural animations while preserving the spatial integrity of keypoints and maintaining temporal consistency across frames. At the core of our approach is the use of residual Jacobians, learned via Neural Ordinary Differential Equations (ODEs), to correct temporal errors without compromising geometric precision. We propose a novel formulation that constrains the ODE to focus exclusively on temporal corrections aggregated over previous frames. To achieve this, we employ a flow matching loss function, which effectively reduces temporal noise and ensures seamless transitions throughout the motion sequence. Extensive experiments and ablation studies demonstrate that FlexiClip outperforms existing methods by producing animations that are smooth, natural, and geometrically consistent. Additionally, our framework supports complex motions involving rotational dynamics, handles multiple conditions specified in text prompts, animates multiple objects within a single image, and generates longer animation sequences. These capabilities highlight its significant improvements in addressing diverse and challenging animation scenarios.

How does it work?



FlexiClip is a novel approach for creating smooth, temporally coherent, and geometrically consistent animated clipart. It extends the state of the art by addressing two key challenges: noise accumulation and structural consistency across frames. For an initial clipart image with M keypoints, FlexiClip assigns M cubic Bézier curves to represent the spatial motion trajectories, parameterized as $\{c^{(i)}\}_{i=0}^{M-1}$. FlexiClip introduces temporal Jacobians to incrementally correct motion dynamics over time, uses a probability flow ODE (pfODE) for continuous-time temporal corrections, and employs a flow matching loss inspired by GFlowNet to reduce temporal noise. By integrating these components with video Score Distillation Sampling (SDS), FlexiClip generates natural, consistent animations, even for complex, non-rigid motions.
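To make the trajectory representation concrete, the sketch below evaluates one cubic Bézier curve per keypoint and samples it at evenly spaced times to obtain per-frame keypoint positions. It is only an illustration of the parameterization described above, not FlexiClip's implementation: the function names `cubic_bezier` and `sample_trajectories`, the 2D point type, and the uniform spacing of frames in time are all assumptions, and the temporal Jacobians, pfODE correction, and SDS optimization are not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(ctrl: List[Point], t: float) -> Point:
    """Evaluate a cubic Bezier curve with 4 control points at t in [0, 1]."""
    if len(ctrl) != 4:
        raise ValueError("a cubic Bezier curve needs exactly 4 control points")
    p0, p1, p2, p3 = ctrl
    s = 1.0 - t
    # Bernstein basis polynomials of degree 3.
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_trajectories(curves: List[List[Point]], num_frames: int) -> List[List[Point]]:
    """Return frames[f][i]: position of keypoint i at frame f.

    `curves` holds M curves (one per keypoint), each with 4 control points.
    Frames are assumed to be uniformly spaced in the curve parameter t.
    """
    if num_frames < 2:
        raise ValueError("need at least 2 frames to span t = 0 .. 1")
    frames = []
    for f in range(num_frames):
        t = f / (num_frames - 1)
        frames.append([cubic_bezier(c, t) for c in curves])
    return frames


# Hypothetical example: M = 2 keypoints, 5 frames.
curves = [
    [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)],
    [(1.0, 1.0), (1.0, 1.5), (1.5, 2.0), (2.0, 2.0)],
]
frames = sample_trajectories(curves, num_frames=5)
```

Each keypoint starts at the first control point of its curve and ends at the last, so in an optimization setting the control points would be the learnable quantities that shape the motion in between.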

Varying the Prompts


We can alter the prompts to generate different movements.

Multi-Layer Animation


Comparison with AniClipart


Comparisons to T2V/I2V Models


We compare our method to four Text/Image-to-Video (T2V/I2V) diffusion baselines: AnimateLCM-I2V, LTXVideo, DynamiCrafter, and Pyramid Flow.

Ablation Study