GenAI - VinAI

Generative AI

Our approach to Generative AI (GenAI) and Large Language Models (LLMs) is driven by our vision to make AI accessible to everyone.

From world-class research that pushes the boundaries of what’s possible with AI to customized AI models and exceptional engineering, we have been moving aggressively to make AI faster, greener, and more efficient for people and businesses everywhere, while addressing regional languages and needs.

Foundation Models

Democratizing foundation models reflects our dedication to the AI community.

PhoGPT

LLM for Vietnamese

A new LLM developed to capture the nuances of Vietnamese. PhoGPT is highly competitive with the closed-source GPT-4 model.

PhoWhisper

Automatic Speech Recognition for Vietnamese

Available in five versions: tiny, base, small, medium, and large. Its robustness is achieved by fine-tuning the multilingual Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents.

SwiftBrush

Instant Image Generation

SwiftBrush v2 sets a new standard in AI image generation. Unlike other models, it creates high-quality images (low FID) in a single step, surpassing both Generative Adversarial Network (GAN) and multi-step Stable Diffusion approaches.

Products & Frameworks

Customizing and enhancing the efficiency of Generative AI models for enterprises and developers

PRAISE

Delivers efficient and secure generative AI solutions tailored to your business needs
Supports tabular question answering and text-to-SQL queries
Provides comprehensive multilingual support, including low-resource languages

Generative Finetuning

An automated service that helps Generative AI developers build domain-specific foundation models
Includes capabilities for fine-tuning end-to-end GenAI applications, such as Retrieval-Augmented Generation (RAG) applications
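
For readers unfamiliar with RAG, the retrieval step can be sketched in a few lines. This is a minimal illustration only, not the service's implementation: the `DOCUMENTS` list, the bag-of-words scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for this example, and a real system would use learned embeddings and send the prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base, assumed for illustration only.
DOCUMENTS = [
    "PhoWhisper is an automatic speech recognition model for Vietnamese.",
    "SwiftBrush generates images in a single step.",
    "BERTweet is a language model pre-trained on English Tweets.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Place the retrieved context ahead of the question for a generator model."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Which model does speech recognition?", DOCUMENTS))
```

Fine-tuning a RAG application end to end would typically adjust both the retriever and the generator; the sketch above covers only how retrieved context is placed in front of the question.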

Generative AI Research

Pioneering world-class research in Generative AI and LLMs.

PhoBERT

Pre-trained Language Models for Vietnamese

Available in both "base" and "large" versions, PhoBERT delivers state-of-the-art performance on four key tasks: part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference.

XPhoneBERT

Phoneme Representations for Nearly 100 Languages and Locales

XPhoneBERT is the first pre-trained multilingual model for phoneme representations for text-to-speech (TTS). It is trained using 330M phoneme-level sentences from nearly 100 languages and locales.

BERTweet

The First Public LLM Pre-trained for English Tweets

Trained using the RoBERTa pre-training procedure on a corpus of 850M English Tweets (16B word tokens), of which 5M are related to the COVID-19 pandemic.

Wavelet Diffusion

Fast Sampling Diffusion Model

Wavelet Diffusion introduces a wavelet-based diffusion scheme that speeds up diffusion models and narrows their speed gap with GANs. It processes the wavelet components of an image efficiently while maintaining generation quality, and a reconstruction term boosts training convergence.
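
To make the idea of wavelet image components concrete, the sketch below applies one level of a 2D Haar transform to a small image. It is an illustration under assumptions, not the Wavelet Diffusion code: the choice of the Haar wavelet, the function name `haar_dwt_2d`, and the list-of-lists image format are all picked for this example.

```python
def haar_dwt_2d(image):
    """One level of an orthonormal 2D Haar transform.

    `image` is a list of rows with even height and width. Returns four
    subbands, each half the height and half the width of the input:
    approx   - scaled sum of each 2x2 block (low-frequency content)
    detail_h - difference between left and right columns of each block
    detail_v - difference between top and bottom rows of each block
    detail_d - diagonal difference within each block
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, detail_h, detail_v, detail_d = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of the current 2x2 block.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2)
            h_row.append((tl - tr + bl - br) / 2)
            v_row.append((tl + tr - bl - br) / 2)
            d_row.append((tl - tr - bl + br) / 2)
        approx.append(a_row)
        detail_h.append(h_row)
        detail_v.append(v_row)
        detail_d.append(d_row)
    return approx, detail_h, detail_v, detail_d


if __name__ == "__main__":
    subbands = haar_dwt_2d([[1, 2], [3, 4]])
    print(subbands)
```

Each subband has a quarter of the pixels of the original image, so a model that operates on the subbands works at a smaller spatial resolution than one that operates on raw pixels.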

Anti-DreamBooth

Protect Personal Published Images From Unwanted Uses

Anti-DreamBooth defends against the misuse of personalized text-to-image synthesis, such as generating fake news or harmful content. It adds subtle noise to users' images before they are published, preventing personalization techniques such as DreamBooth from learning them.