Real-time video call AI that can see, hear, and speak