Top Best LLM Platforms of 2026: Ranking and Comparison

Over the past few years, large language models (LLMs) have evolved from an experimental technology into an everyday working tool for developers, businesses, and ordinary users. Today, choosing a platform is not just a question of how "smart" the model is, but also of cost, API access, licensing, and integration options.

The market has become extremely crowded: alongside the closed flagships from OpenAI, Google, and Anthropic, an entire ecosystem of open models from Meta, DeepSeek, Alibaba, and others is growing. Figuring out exactly what fits your tasks is getting harder and harder.

In this ranking we compare the most popular LLM platforms of 2026 against a single set of criteria: official site, developer, availability of open source, API access, free plan, plans and their pricing, context window, multimodality, flagship models, and the strengths and weaknesses of each. All prices are in US dollars as of June 2026.

Website ratings
#	Name	Domain	Rating
1	ChatGPT	openai.com	4.90
2	Claude (Anthropic)	claude.com	4.80
3	Google Gemini	gemini.google.com	4.70
4	DeepSeek	deepseek.com	4.60
5	Grok (xAI)	x.ai	4.45
6	Qwen (Alibaba)	qwen.ai	4.40
7	Llama (Meta)	llama.com	4.25
8	GLM (Zhipu AI)	z.ai	4.20
9	Kimi (Moonshot AI)	moonshot.ai	4.15
10	Mistral	mistral.ai	4.10
11	Cohere Command	cohere.com	4.00
12	ERNIE (Baidu)	yiyan.baidu.com	3.90
13	Nova (Amazon)	aws.amazon.com	3.90
14	Hunyuan (Tencent)	hunyuan.tencent.com	3.80
15	Phi (Microsoft)	azure.microsoft.com	3.80
16	Yi (01.AI)	01.ai	3.70
17	OLMo (Ai2)	allenai.org	3.60
18	Reka	reka.ai	3.50
19	DBRX (Databricks)	databricks.com	3.40
20	Molmo (Ai2)	molmo.allenai.org	3.40
21	Krutrim	krutrim.ai	3.00

GPT (OpenAI) — the world's most popular LLM platform

OpenAI remains the undisputed market leader and sets the pace for the entire industry. Its ChatGPT product has become synonymous with artificial intelligence for hundreds of millions of users, and the GPT model lineup is the industry benchmark that every competitor is measured against.

Official site: openai.com

Developer: OpenAI (USA)

Open source: Mostly closed (proprietary). Flagship GPT models are not publicly available, although OpenAI has released separate open-weight models in the gpt-oss series.

API access: Yes, a full REST API with per-token billing. GPT-5.5 costs $5 per million input / $30 per million output tokens; cheaper Mini and Nano models are available, plus a Batch API (-50%) and context caching.

Free plan: Yes, with access to the GPT-5.3 model and a limit of roughly 10 messages per 5 hours. As of February 2026, ads have appeared on the free tier.

Subscription plans & pricing:

Free: $0 — access to GPT-5.3 with limits and ads
Go: $8/mo — higher limits, also with ads
Plus: $20/mo — access to GPT-5.5 with expanded limits
Pro: $100/mo — 5x higher limits than Plus + exclusive GPT-5.5 Pro
Pro (max): $200/mo — 20x higher limits and up to 1M token context
Business: $25–30 per user/mo (min 2 seats) — SSO, SOC 2, admin tools
Enterprise: Custom pricing — data residency, fine-tuning, SLA

Context window: Up to 1,000,000 tokens (on the Pro $200 tier and via API); less on lower tiers

Multimodality: Yes — text, images (analysis and generation via DALL·E), voice (Advanced Voice), and video via Sora

Flagship models: GPT-5.5, GPT-5.5 Pro, GPT-5.4 Mini, GPT-5.4 Nano, GPT-5.3 (free tier); open-weight gpt-oss series

Strengths:

The largest ecosystem of integrations, plugins, and third-party apps
Consistently among the strongest models for reasoning, code, and writing
The broadest community, documentation, and third-party support
Full multimodality in a single platform (text, images, voice, video)

Weaknesses:

Closed license — no access to flagship model weights
Top tiers ($100–200/mo) are pricier than most competitors
Ads on the Free and Go tiers
A ChatGPT subscription does not include API access — it's billed separately

OpenAI is the safest "default" choice: maximum brand recognition, the broadest integration ecosystem, and consistently strong models. The trade-offs are a closed license, relatively high pricing on top tiers, and the arrival of ads on the free tier.

Claude (Anthropic) — leader in code, analysis, and safety

Claude by Anthropic is OpenAI's main competitor, especially prized for writing quality, working with large documents, and programming. Anthropic emphasizes safety and predictability through its Constitutional AI approach.

Official site: claude.com

Developer: Anthropic (USA)

Open source: Closed (proprietary). Claude model weights are not available; access is only through the app and the API.

API access: Yes, a full API with per-token billing. Sonnet 4.6 is $3/$15 per million, Opus is about $5/$25 per million; prompt caching and a Batch API with discounts of up to 90% are supported.

Free plan: Yes, a free tier with access to current models and limited usage allowances.

Subscription plans & pricing:

Free: $0 — basic access with limits
Pro: $20/mo ($17/mo billed annually) — expanded limits, Claude Code
Max 5x: $100/mo — 5x more usage than Pro
Max 20x: $200/mo — 20x more usage, for heavy workloads
Team: $25–30 per user/mo (min 5 seats) — SSO, admin console
Enterprise: Custom pricing — enhanced security, data residency

Context window: Up to 1,000,000 tokens — one of the largest among closed models, ideal for documents

Multimodality: Partial — accepts text and images as input; no native image, audio, or video generation

Flagship models: Claude Opus 4.8 (flagship), Claude Opus 4.7, Claude Sonnet 4.6, Claude Haiku 4.5

Strengths:

One of the world's strongest for programming (Claude Code) and long documents
Large context window of up to 1M tokens
Measured, careful responses and a strong safety focus
Handy Projects and tools for team collaboration

Weaknesses:

No image, video, or voice generation
Top Max tiers ($100–200/mo) — monthly billing only
Smaller plugin ecosystem than OpenAI
Sometimes overly cautious on sensitive topics

Claude is the best choice for those who work heavily with code and long texts and need a careful, well-reasoned assistant. Its weaker points are the lack of image/video generation and relatively expensive top-tier Max plans.

Gemini (Google) — deep integration with the Google ecosystem

Gemini is Google's flagship model lineup, tightly integrated with Workspace, Android, Chrome, and Search. It is one of the strongest platforms for multimodality, especially when working with video and audio.

Official site: gemini.google.com

Developer: Google DeepMind (USA)

Open source: Closed (proprietary). Google separately develops the open Gemma lineup, but the Gemini models themselves are closed.

API access: Yes — the Gemini API and Vertex AI. Gemini 3.5 Flash is $1.50/$9.00 per million tokens; more expensive Pro models and Batch tiers are available.

Free plan: Yes, a free tier on the Gemini 3 Flash model with access to newer models within a daily quota.

Subscription plans & pricing:

Free: $0 — Gemini 3 Flash with daily limits
Google AI Plus: $7.99/mo — higher limits and access to newer models
Google AI Pro: $19.99/mo — flagship models, 1M context, Deep Research
Google AI Ultra: from $99.99/mo — ~5x Pro limits, 20+ TB storage
Google AI Ultra (top): $200/mo — up to 20x Pro limits, premium features
Workspace: Gemini built into paid Google Workspace business tiers

Context window: Up to 1,000,000 tokens on paid tiers and via API

Multimodality: Yes, natively — text, images, audio, and video; one of the strongest platforms for multimodal tasks

Flagship models: Gemini 3.5 Pro, Gemini 3.5 Flash, Gemini 3 Flash (free), Gemini Omni (generation from any input)

Strengths:

Deep integration with Gmail, Docs, Android, Chrome, and Google Search
The strongest multimodality, especially video and audio
A generous free tier and an affordable entry subscription ($7.99)
Large context window and a powerful Deep Research mode

Weaknesses:

Confusing plan structure and frequent name changes (Advanced → AI Pro/Ultra)
Staged model rollouts: different users see different versions
Regulatory and regional restrictions on feature availability (especially in the EU)
Closed weights — no self-hosting of flagships

Gemini is the optimal choice for those already living in the Google ecosystem: Gmail, Docs, YouTube, and Android. Strong multimodal models and a generous free tier offset some confusion in plan naming and regional restrictions.

DeepSeek — the strongest open model for price/quality

DeepSeek is a Chinese lab that upended the market by releasing powerful open models at a fraction of the cost of Western flagships. Its V4 models lead the open-source field in programming and agentic tasks.

Official site: deepseek.com

Developer: DeepSeek AI (China)

Open source: Yes — open weights under the MIT license. The models can be freely downloaded and run independently.

API access: Yes — an inexpensive API (OpenAI- and Anthropic-compatible). V4-Flash is about $0.14/$0.28 per million tokens, V4-Pro is more expensive; aggressive context caching cuts the cost of agentic tasks.

Free plan: Yes — the DeepSeek chat app is free for all users.

Subscription plans & pricing:

Chat app: $0 — free access to models via web chat and the app
API V4-Flash: About $0.14 / $0.28 per million input/output tokens
API V4-Pro: About $0.44 / $0.87 per million tokens (flagship reasoning)
Self-hosting: Free per token — your own infrastructure, MIT license

Context window: Up to 1,000,000 tokens in the V4 models

Multimodality: Mostly text and reasoning; separate DeepSeek-VL models exist for images

Flagship models: DeepSeek V4-Pro, DeepSeek V4-Flash (1M context, MIT); earlier V3.2 and R1 have been retired

Strengths:

Near-flagship quality at an extremely low price
Fully open weights (MIT) — self-hostable with no vendor lock-in
Leads open models in code and agentic tasks
Compatibility with OpenAI- and Anthropic-style API formats

Weaknesses:

Mostly text-focused, weaker multimodality
Privacy and data-storage concerns under the Chinese jurisdiction
Frequent model changes and retirement of old API aliases
Less developed official ecosystem and support

DeepSeek is the best choice for developers and companies that want near-flagship quality for minimal money or with self-hosting options. The limitations are a mostly text-focused approach and trust concerns around the Chinese jurisdiction for data.

Qwen (Alibaba) — the largest and most diverse model family

Qwen (通义千问) by Alibaba Cloud is one of the largest model families in the world: over a hundred models for text, vision, audio, code, and translation. The open Qwen versions have become developer favorites and frequently top open-source leaderboards.

Official site: qwen.ai

Developer: Alibaba Cloud (China)

Open source: Partial — many Qwen3/Qwen3.5 models are open under Apache 2.0, but the newest flagships (3.7-Max, 3.7-Plus) are closed and available only via API.

API access: Yes — via Alibaba Cloud Model Studio (DashScope), OpenAI-compatible. Prices range from $0.05 per million (Turbo) to several dollars for the flagship.

Free plan: Yes — the Qwen Chat app is free; an onboarding quota is available for the API, but there is no longer a permanently free API.

Subscription plans & pricing:

Qwen Chat: $0 — free web chat and app
API (open models): Free self-hosting (Apache 2.0) or hosting from ~$0.05/M
Qwen3.7-Plus: About $0.40 / $1.60 per million tokens (multimodal)
Qwen3.7-Max: About $2.50 / $7.50 per million tokens (flagship, closed)

Context window: Up to 1,000,000 tokens in the larger models

Multimodality: Yes — Qwen-VL and Qwen3.7-Plus support images and video; separate audio and video models exist

Flagship models: Qwen3.7-Max, Qwen3.7-Plus (closed), Qwen3.5 (397B, open), Qwen3-Coder, Qwen3 235B-A22B

Strengths:

The broadest model family for any budget and task
Strong open models under Apache 2.0 with self-hosting options
Very low prices on small and mid-tier models
Multilingual (200+ languages) and powerful code models

Weaknesses:

The newest flagships have become closed — a departure from the open-source strategy
Cancellation of the permanently free API (April 2026)
The large number of models complicates the choice
Data privacy concerns under the Chinese jurisdiction

Qwen suits those looking for a wide selection of models for any task — from tiny to flagship. Note that the newest models (3.7-Max/Plus) have become closed and are available via API only, unlike earlier open releases.

Grok (xAI) — a model with real-time access to X

Grok by Elon Musk's xAI stands out for its tight integration with the X social network, real-time information access, and a less "censored" conversational style. The Grok 4.x lineup has quickly caught up with top models in capability.

Official site: x.ai

Developer: xAI (USA)

Open source: Mostly closed. The weights of the older Grok-1 model were once opened, but the current Grok 4.x models are closed.

API access: Yes — an OpenAI-compatible API. Grok 4.3 is $1.25/$2.50 per million tokens; Grok 4.20 is $2/$6 per million (context up to 2M); up to $175/mo in free credits for participating in the data-sharing program.

Free plan: Yes — a free tier with a limit of roughly 10 requests per 2 hours.

Subscription plans & pricing:

Free: $0 — limited access (~10 requests / 2 hrs)
X Premium: $8/mo — better Grok access within X
SuperGrok Lite: $10/mo — entry-level paid tier with Grok Imagine
SuperGrok: $30/mo ($300/yr) — full access, unlimited requests
X Premium+: $40/mo — Grok + premium X features
SuperGrok Heavy: $300/mo — Grok 4 Heavy and maximum limits
Business: $30 per user/mo — team access and admin tools

Context window: Up to 1,000,000 tokens (Grok 4.3) and up to 2,000,000 in the Grok 4.20 variants

Multimodality: Yes — text, images (generation via Grok Imagine), image recognition, and a voice mode

Flagship models: Grok 4.3 (flagship), Grok 4.20 (variants up to 2M context), Grok 4.1 Fast, Grok 4 Heavy

Strengths:

Real-time information access through integration with X
Large context window (up to 2M) in some models
Generous free API credits for developers
Image and video generation, a less restricted response style

Weaknesses:

A very confusing structure with many plans and staged model rollouts
Some features are tied to the paid X social network
The priciest consumer tier (Heavy) costs $300/mo
Closed weights of the flagship models

Grok is attractive for active X users and those who need access to fresh data and image/video generation. A confusing structure with a multitude of plans and the tie-in of some features to the social network are its main drawbacks.

Llama (Meta) — the standard for open self-hosted models

Llama by Meta was long the #1 open model and still remains the foundation of a huge ecosystem. The Llama 4 lineup is natively multimodal, with a record context window of up to 10M tokens in the Scout model.

Official site: llama.com

Developer: Meta (USA)

Open source: Yes — open weights under the custom Llama Community License (with restrictions for very large services), not fully OSI-compliant.

API access: Meta does not offer its own official API; access is via third-party providers (Together, Fireworks, Groq, AWS) from ~$0.18/$0.59 per million (Scout).

Free plan: Yes — the Meta AI assistant is free in the WhatsApp, Instagram, and Facebook apps; the model weights are also free.

Subscription plans & pricing:

Meta AI: $0 — free assistant in Meta's apps
Self-hosting: Free per token — your own infrastructure, open weights
Llama 4 Scout (hosted): About $0.18 / $0.59 per million tokens at providers
Llama 4 Maverick (hosted): Higher price for higher quality, provider-dependent

Context window: Up to 10,000,000 tokens (Llama 4 Scout) — a record window among open models

Multimodality: Yes, natively — Llama 4 works with text and images from the ground up

Flagship models: Llama 4 Scout (10M context), Llama 4 Maverick, Llama 4 Behemoth (the largest)

Strengths:

Free open weights and full self-hosting capability
A huge ecosystem of tools, fine-tuned versions, and community
A record context window of up to 10M tokens
Native multimodality (text + images)

Weaknesses:

A license with restrictions — not fully free for very large services
No official API from Meta — only third-party providers
Flagship quality trails the best Chinese open models
Requires resources and expertise for self-hosting

Llama is a great choice for companies that want to self-host for free and customize the model. That said, in raw flagship quality it now trails the best Chinese open models, and its license is not fully open.

GLM (Zhipu AI) — a powerful open model strong at code

GLM by Zhipu AI is one of the strongest open models of 2026, consistently sitting near the top of open-source leaderboards alongside Qwen. It is especially valued for programming and agentic scenarios.

Official site: z.ai

Developer: Zhipu AI / 智谱 (China)

Open source: Yes — key GLM models are open (including under MIT) and available for download and self-hosting.

API access: Yes — via the Z.ai and BigModel platforms, OpenAI-compatible, with competitive per-token pricing.

Free plan: Yes — a web chat powered by GLM is available for free with limits.

Subscription plans & pricing:

Web chat: $0 — free access with limits
Self-hosting: Free per token — open weights
API: Per-token billing via Z.ai / BigModel, competitive rates
Enterprise: Custom terms for business

Context window: Up to 200,000 tokens in the larger models

Multimodality: Partial — multimodal versions (GLM-V) exist for working with images

Flagship models: GLM-4.6, GLM-4.5 (open weights), the multimodal GLM-V lineup

Strengths:

One of the strongest open models by quality
Open weights with self-hosting capability
Strong results in code and agentic tasks
Competitive API pricing

Weaknesses:

Lower global recognition than Llama or Qwen
Data privacy concerns (Chinese jurisdiction)
Less official English-language documentation and support
Weaker multimodality than top closed models

GLM is an excellent choice for developers seeking an open alternative for code and self-hosting. Its weaker points are lower global recognition and privacy concerns typical of Chinese platforms.

Mistral — a European platform focused on privacy

Mistral AI is a leading European lab offering a mix of open and commercial models hosted in the EU. Its Le Chat product and focus on GDPR compliance make it a popular choice for European businesses.

Official site: mistral.ai

Developer: Mistral AI (France / EU)

Open source: Partial — the smaller models are open under Apache 2.0; the flagship commercial models are closed.

API access: Yes — La Plateforme with per-token billing. Mistral Large is about $1.50–2/$6–7.50 per million; the smallest models start at $0.15 per million.

Free plan: Yes — Le Chat has a free tier with basic limits.

Subscription plans & pricing:

Le Chat Free: $0 — basic access with limits
Le Chat Pro: About $14.99/mo — expanded limits and features
Le Chat Team: $24.99 ($19.99 annual) per user/mo — shared libraries, admin
Le Chat Enterprise: Custom pricing — private hosting, security
API: Per-token billing via La Plateforme

Context window: Up to 128,000 – 256,000 tokens depending on the model

Multimodality: Partial — the Pixtral lineup works with images; the core models are text-based

Flagship models: Mistral Large, Medium 3.5, Small (open), Magistral (reasoning), Devstral (code), Pixtral (vision)

Strengths:

EU hosting and GDPR compliance — important for European business
A mix of open (Apache 2.0) and commercial models
Strong multilingual support, especially European languages
Very cheap smaller models for large-scale tasks

Weaknesses:

Flagships trail the market leaders in quality
Growing competition from cheaper, stronger Chinese open models
Smaller context window than top competitors
Smaller ecosystem and global recognition

Mistral is the optimal option when data privacy, EU residency, and GDPR are critical. That said, in raw flagship quality the platform trails the leaders, and its new open releases face ever-stiffer competition from China.

Kimi (Moonshot AI) — a record-long context window

Kimi by Moonshot AI gained popularity thanks to its ultra-long context window and strong open models in the K2 series. It is one of the most popular chat assistants in China, geared toward working with large volumes of text.

Official site: moonshot.ai

Developer: Moonshot AI / 月之暗面 (China)

Open source: Yes — the Kimi K2 model was released with open weights (a large MoE architecture).

API access: Yes — an API via the Moonshot platform, OpenAI-compatible, with competitive pricing.

Free plan: Yes — the Kimi app is free for users with limits.

Subscription plans & pricing:

Kimi chat: $0 — free web chat and app
Self-hosting: Free per token — open K2 weights
API: Per-token billing via the Moonshot AI platform
Enterprise: Custom terms for business

Context window: Up to ~2,000,000 tokens — one of the longest contexts on the market

Multimodality: Partial — mostly text, with expanding multimodal capabilities

Flagship models: Kimi K2 (open weights, large MoE), earlier versions of the Kimi lineup

Strengths:

A record-long context for working with large documents
Open weights for the flagship K2 model
A very popular, polished chat assistant
Competitive API pricing

Weaknesses:

Focused mostly on the Chinese market and language
Data privacy concerns (Chinese jurisdiction)
Weaker multimodality than the leaders
Less English documentation and global support

Kimi is worth considering for those who need to process very large documents and who value open weights. The limitations are standard for Chinese platforms: less global support and privacy concerns.

Cohere Command — a platform for business and RAG

Cohere is a company focused exclusively on business and enterprise. Its Command models specialize in corporate scenarios: RAG, knowledge-base search, and secure private deployment.

Official site: cohere.com

Developer: Cohere (Canada)

Open source: Partial — Command R weights were opened for research use; the Command A flagship remains commercial.

API access: Yes — the Cohere API with per-token billing; separate Embed and Rerank models for search.

Free plan: Limited — a free trial API key with limits for developers; there is no consumer free chat.

Subscription plans & pricing:

Trial API: $0 — limited trial access for developers
Production API: Per-token billing for the Command, Embed, Rerank models
North / Enterprise: Custom pricing — private deployment and support
Self-hosting: Possible for open Command R weights under separate terms

Context window: Up to 256,000 tokens

Multimodality: Mostly text; focused on text and search (RAG) scenarios

Flagship models: Command A (flagship), Command R+, Command R, plus Embed and Rerank models for search

Strengths:

Deep specialization in RAG and enterprise search
Strong multilingual support and Embed/Rerank models
A focus on security and private deployment
The North platform for building enterprise AI agents

Weaknesses:

Almost no product for regular users (no chat or subscription)
Lower raw flagship quality than top models
Weak multimodality
An enterprise-only focus raises the barrier to entry

Cohere Command is a niche but strong choice for companies that need RAG, multilingual search, and full control over deployment. For regular users the platform is barely intended — there is no consumer chat or subscription.

ERNIE (Baidu) — a leading LLM platform in China

ERNIE by Baidu is one of the oldest and most widespread LLM platforms in China, integrated into Baidu's search and cloud services. In 2025 Baidu opened the ERNIE 4.5 weights, strengthening its open-source position.

Official site: yiyan.baidu.com

Developer: Baidu (China)

Open source: Partial — the ERNIE 4.5 weights were opened in 2025; some models remain closed.

API access: Yes — via the Baidu AI Cloud (Qianfan) platform with per-token billing.

Free plan: Yes — the ERNIE Bot chat (文心一言) offers free access with limits.

Subscription plans & pricing:

ERNIE Bot Free: $0 — free chat with limits
Self-hosting: Free per token for the open ERNIE 4.5 models
API (Qianfan): Per-token billing via Baidu AI Cloud
Enterprise: Custom terms for business

Context window: Up to 128,000 tokens depending on the model

Multimodality: Yes — multimodal versions exist for working with images and content generation

Flagship models: ERNIE 4.5 (open), ERNIE X1 (reasoning), earlier versions of the ERNIE lineup

Strengths:

A leader in the Chinese market with tight integration into Baidu services
Open ERNIE 4.5 weights
Strong Chinese-language and local-context handling
Mature multimodal capabilities

Weaknesses:

Focused mostly on China and the Chinese language
Harder access and weaker support for foreign users
Data privacy concerns (Chinese jurisdiction)
Lower global recognition of the models

ERNIE is a strong choice for Chinese-language tasks and companies in the Baidu Cloud ecosystem. Outside China the platform is less convenient due to language and regional restrictions and weaker English support.

Nova (Amazon) — a model native to the AWS ecosystem

Amazon Nova is Amazon's own model family, available through AWS Bedrock. It targets AWS enterprise customers, offering a balance of price, speed, and integration with the cloud infrastructure.

Official site: aws.amazon.com

Developer: Amazon (USA)

Open source: Closed (proprietary). Access only through AWS Bedrock.

API access: Yes — via AWS Bedrock with per-token billing and integration with AWS infrastructure.

Free plan: Limited — an AWS free tier and trial credits exist; there is no separate consumer chat.

Subscription plans & pricing:

AWS Free Tier: $0 — limited trial credits within AWS
Bedrock (Micro/Lite/Pro): Per-token billing, price rises with model power
Nova Premier: The most powerful model, highest per-token rate
Enterprise (AWS): Corporate agreements and volume discounts

Context window: Up to 300,000 tokens (Nova Pro/Premier)

Multimodality: Yes — accepts text, images, and video; separately, Nova Canvas (images) and Nova Reel (video)

Flagship models: Amazon Nova Micro, Nova Lite, Nova Pro, Nova Premier; Nova Canvas and Nova Reel for generation

Strengths:

Native integration with AWS Bedrock and Amazon infrastructure
Good price-to-speed ratio for large-scale tasks
Enterprise-grade security and compliance
Multimodality with separate models for images and video

Weaknesses:

Access only through AWS — virtually unused outside the ecosystem
Closed weights, no self-hosting
The flagship trails top models in quality
No simple consumer chat or subscription

Nova is the logical choice for companies already on AWS that want a native, secure model without leaving the ecosystem. Outside AWS the platform is barely used, and the flagship trails the market leaders in raw quality.

Hunyuan (Tencent) — a model for the WeChat and Tencent Cloud ecosystem

Hunyuan by Tencent is a model family integrated into Tencent's products, including WeChat and Tencent Cloud. Tencent has opened some of the models, including versions for 3D generation.

Official site: hunyuan.tencent.com

Developer: Tencent (China)

Open source: Partial — a number of models are open (Hunyuan-Large, Hunyuan-A13B, Hunyuan3D); the flagship versions are closed.

API access: Yes — via Tencent Cloud with per-token billing.

Free plan: Yes — free access is available through Tencent's products with limits.

Subscription plans & pricing:

Chat / Tencent products: $0 — free access with limits
Self-hosting: Free per token for the open Hunyuan models
API (Tencent Cloud): Per-token billing
Enterprise: Custom terms for business

Context window: Up to 256,000 tokens depending on the model

Multimodality: Yes — multimodal models exist, including Hunyuan3D for generating 3D content

Flagship models: Hunyuan-Large, Hunyuan-A13B (open), Hunyuan-Turbo, Hunyuan3D

Strengths:

Integration with Tencent's huge ecosystem (WeChat, games, cloud)
Open weights for some of the models
Unique 3D generation capabilities (Hunyuan3D)
Strong Chinese-language handling

Weaknesses:

Focused mostly on China and the Chinese language
Harder access for foreign users
Data privacy concerns (Chinese jurisdiction)
Lower global recognition of the flagships

Hunyuan is mainly interesting for businesses in the Tencent ecosystem and Chinese-language scenarios. For a global audience the platform is less accessible due to language and regional restrictions.

Phi (Microsoft) — compact models for devices and the edge

Phi by Microsoft is a family of small but remarkably efficient models (SLMs). They prove that carefully curated data lets small models rival much larger ones in reasoning and code — ideal for on-device and edge scenarios.

Official site: azure.microsoft.com

Developer: Microsoft (USA)

Open source: Yes — the Phi models are open under the MIT license, available on Hugging Face and Azure.

API access: Yes — via Azure AI Foundry; they can also be run locally thanks to their small size.

Free plan: Yes — the open weights are free; Azure offers a free tier and trial credits.

Subscription plans & pricing:

Self-hosting: Free — open weights (MIT), can even run locally
Azure AI Foundry: Pay-as-you-go usage in the Azure cloud
Azure Free Tier: $0 — trial credits for testing
Enterprise (Azure): Corporate agreements within Microsoft Azure

Context window: Up to 128,000 tokens in the larger versions

Multimodality: Partial — a multimodal version (Phi multimodal) exists for images and audio

Flagship models: Phi-4, Phi-4-mini, Phi-4-multimodal, and earlier versions of the Phi family

Strengths:

The best quality-to-size ratio among small models
Open weights (MIT) and the ability to run locally
Very low inference cost and high speed
Deep integration with the Microsoft Azure ecosystem

Weaknesses:

Trails the large flagships in raw power
Smaller context window than top models
Limited multimodality
Not designed for the most complex reasoning tasks

Phi is the best choice where speed, low cost, and local, cloud-free operation matter. In raw power these models predictably trail the large flagships, but their quality-to-size ratio is among the best on the market.

Yi (01.AI) — open models from Kai-Fu Lee's team

Yi is a model family from 01.AI, founded by the renowned technologist Kai-Fu Lee. The platform combines open releases with commercial models and targets both the Chinese and global markets.

Official site: 01.ai

Developer: 01.AI / 零一万物 (China)

Open source: Partial — a number of Yi models are open (including under Apache 2.0); commercial versions also exist.

API access: Yes — an API via the 01.AI platform with per-token billing.

Free plan: Yes — free chat access is available with limits.

Subscription plans & pricing:

Chat / Free: $0 — free access with limits
Self-hosting: Free per token for the open Yi models
API: Per-token billing via the 01.AI platform
Enterprise: Custom terms for business

Context window: Up to 200,000 tokens in the larger versions

Multimodality: Partial — multimodal versions (Yi-VL) exist for working with images

Flagship models: Yi-Large (flagship), the open Yi models (Apache 2.0), the multimodal Yi-VL

Strengths:

Strong open bilingual (Chinese-English) models
Self-hosting capability
A team with a strong technical reputation
Competitive pricing

Weaknesses:

Lost some attention amid Qwen, DeepSeek, and GLM
Weaker multimodality than the leaders
Data privacy concerns (Chinese jurisdiction)
Smaller ecosystem and global support

Yi is a solid option for those seeking open bilingual (Chinese-English) models. However, amid the rapid rise of Qwen, DeepSeek, and GLM, the platform has lost some of its former prominence.

OLMo (Ai2) — a fully open model for research

OLMo by the Allen Institute for AI (Ai2) is a truly open model: along with the weights, the training data, code, and checkpoints are published. Its main value is transparency and reproducibility for the research community.

Official site: allenai.org

Developer: Allen Institute for AI / Ai2 (USA)

Open source: Yes — fully open-source: weights, data, training code, and checkpoints under the Apache 2.0 license.

API access: Not as a commercial service; access is via self-hosting or third-party hosting providers.

Free plan: Yes — all artifacts are free to download and use.

Subscription plans & pricing:

Self-hosting: Free — the full set (weights, data, code) under Apache 2.0
Hosting providers: Per-token billing at third-party providers (e.g. Together)
Research / education: Free use for science and education

Context window: Up to 4,096 – 65,000 tokens depending on the version

Multimodality: Mostly text; Ai2 develops multimodality through its separate Molmo lineup

Flagship models: OLMo 2 and earlier versions — with fully open training data and code

Strengths:

Unprecedented transparency: not just weights, but data and code are open
Ideal for research, education, and reproducibility
A fully free Apache 2.0 license
Backed by a reputable non-profit institute

Weaknesses:

Trails commercial flagships in raw quality
Small context window
No ready-made consumer product or official API
Geared toward research rather than mass production use

OLMo is an indispensable tool for researchers and education, where understanding exactly how a model was trained matters. For production tasks it trails commercial flagships in raw quality, but it wins on openness.

Reka — a multimodal platform from an independent lab

Reka is an independent lab that has bet on multimodality from the start: its models work with text, images, audio, and video. The Reka lineup (Core, Flash, Edge) targets different task and device levels.

Official site: reka.ai

Developer: Reka AI (USA / UK)

Open source: Partial — some models (notably Reka Flash) have been opened; the flagship remains commercial.

API access: Yes — an API via the Reka platform with per-token billing.

Free plan: Limited — trial access and free credits are available for testing.

Subscription plans & pricing:

Trial: $0 — trial credits for testing
Self-hosting: Free per token for the open models (Reka Flash)
API: Per-token billing via the Reka platform
Enterprise: Custom terms, including on-premise

Context window: Up to 128,000 tokens depending on the model

Multimodality: Yes, natively — text, images, audio, and video from the ground up

Flagship models: Reka Core (flagship), Reka Flash (open), Reka Edge (for devices)

Strengths:

Strong native multimodality (text, images, audio, video)
Compact models for the edge and devices
Some models are available with open weights
Flexible deployment options, including on-premise

Weaknesses:

A small player with limited resources
Smaller ecosystem and recognition
The flagship trails top models in quality
Less documentation and third-party integrations

Reka is interesting for those who need a compact multimodal model, particularly for edge scenarios. As a small player, it trails the leaders in ecosystem scale and compute resources.

DBRX (Databricks) — an open model for the data platform

DBRX is an open MoE model from Databricks, built primarily for customers of its data and analytics platform. It lets you build custom AI solutions right on top of corporate data in Databricks.

Official site: databricks.com

Developer: Databricks (USA)

Open source: Yes — open weights under the open Databricks license; the model can be downloaded and run.

API access: Yes — via the Databricks platform (Mosaic AI) with integration into data workflows.

Free plan: Yes — the open weights are free; Databricks offers a trial period.

Subscription plans & pricing:

Self-hosting: Free per token — open DBRX weights
Databricks (Mosaic AI): Billing within the Databricks platform
Trial: $0 — trial access to the platform
Enterprise: Databricks corporate agreements

Context window: Up to 32,000 tokens

Multimodality: Mostly text; geared toward working with corporate data

Flagship models: DBRX (an open MoE model, ~132B parameters, ~36B active)

Strengths:

Deep integration with the Databricks data platform
Open weights and the ability to train on your own data
Suitable for corporate analytics scenarios
Reliable support from a major vendor

Weaknesses:

The model has noticeably aged compared to fresh open releases
Small context window
Greatest value only within the Databricks ecosystem
Weak multimodality and no consumer product

DBRX mainly makes sense for companies already on Databricks that want to train models on their own data. As a standalone model it has noticeably aged and trails fresher open releases.

Molmo (Ai2) — an open multimodal model for vision

Molmo is an open multimodal model from Ai2 that specializes in image understanding. Like OLMo, it is valued for its transparency and openness for the research community.

Official site: molmo.allenai.org

Developer: Allen Institute for AI / Ai2 (USA)

Open source: Yes — an open-source multimodal model under the Apache 2.0 license.

API access: Not as a commercial service; access is via self-hosting or third-party providers.

Free plan: Yes — the weights are free to download and use.

Subscription plans & pricing:

Self-hosting: Free — open weights under Apache 2.0
Hosting providers: Per-token billing at third-party providers
Research / education: Free use for science

Context window: Depends on the version; geared toward images rather than long text

Multimodality: Yes — specializes in image understanding (vision-language)

Flagship models: Molmo (an open vision-language model, several sizes)

Strengths:

An open and transparent multimodal model
Strong at image understanding for its size
A free Apache 2.0 license
Backed by the reputable Ai2 institute

Weaknesses:

Narrow specialization — vision only, not a universal assistant
Trails top multimodal models in quality
No ready-made product or official API
Geared toward research rather than production

Molmo is a narrowly specialized but useful tool for researchers in computer vision and vision-language tasks. It is not a universal assistant but rather an open base for experiments and customization.

Krutrim — an LLM platform for Indian languages

Krutrim is an Indian LLM platform from the eponymous company (the Ola group), with an emphasis on supporting dozens of Indian languages. It is an attempt to build a sovereign AI for a market poorly served by Western models.

Official site: krutrim.ai

Developer: Krutrim (Ola) (India)

Open source: Partial — the company has opened individual models for Indian languages; the flagship versions are closed.

API access: Yes — an API via the Krutrim cloud platform with per-token billing.

Free plan: Yes — free access to the assistant is available with limits.

Subscription plans & pricing:

Assistant / Free: $0 — free access with limits
Self-hosting: Free per token for the open models
API (Krutrim Cloud): Per-token billing
Enterprise: Custom terms for business

Context window: Up to 128,000 tokens depending on the model

Multimodality: Partial — multimodal support is being developed; the main focus is on text

Flagship models: Krutrim (flagship for Indian languages), the company's open language models

Strengths:

The best support for dozens of Indian languages and local context
A sovereign solution for the Indian market
Some models are available with open weights
Integration with the Ola ecosystem

Weaknesses:

A young platform that trails the global leaders in quality
A narrow regional focus (India)
Smaller ecosystem, documentation, and support
Weaker multimodality

Krutrim mainly makes sense for tasks where high-quality support for Indian languages and local context is critical. As a young platform it noticeably trails the global leaders in overall quality and ecosystem maturity.