TwelveLabs Company Analysis

Deep Dive · AI / Video Intelligence Analysis

TwelveLabs

A video-native foundation model company that treats video as its own signal rather than a byproduct of text — five years after founding, TwelveLabs closed a $100M Series B co-led by NEA and NAVER Ventures with Amazon participating, formally advancing its “Video Superintelligence” roadmap

$100M Series B Raised (Jul 2026)

$207M+ Total Capital Raised

30,000+ Cumulative Developer Users

2021 Founded (San Francisco)

👤

Section 01

Founder Background & Origin Story

TwelveLabs (legal entity: Twelve Labs, Inc.) is a video-focused multimodal AI company founded in January 2021 in San Francisco. At a time when the broader generative AI industry was advancing primarily around text and images, TwelveLabs chose a deliberately narrow path — building purpose-built foundation models for “video understanding,” an area that had received comparatively little research attention. Over five years, this focus has made it one of the leading video AI companies, earning the confidence of global capital including Amazon, NEA, and NAVER Ventures. The company is headquartered in the San Francisco Bay Area, with an APAC office in Seoul.

👨‍💻

Jae Lee

Co-Founder & Chief Executive Officer

Born in Seoul, Jae Lee later moved to the United States, attending Phillips Exeter Academy (2009–2013) before studying computer science at UC Berkeley (2013–2017). Before founding TwelveLabs, he fulfilled his mandatory military service with the Republic of Korea’s Cyber Operations Command under the Ministry of National Defense, serving as a lead data scientist applying machine learning to national-scale problems — an experience that directly shaped his conviction about video understanding. He also gained industry experience as a software engineering intern at both Amazon and Samsung. While serving in the cyber unit, discussions with fellow service members focused on AI led him to recognize a clear gap: while the industry was pouring resources into text and image processing, video — a rapidly growing and increasingly dominant data type — remained comparatively understudied. This insight became the direct catalyst for founding TwelveLabs in 2021. He currently serves on the board of South Korea’s foundation model industry association, coordinating collaboration with major Korean conglomerates including Samsung, SK, and LG.

Aiden Lee

Co-Founder & CTO

Leads TwelveLabs’ overall technical architecture and directs research and development of its core foundation models, Marengo and Pegasus. His research focus is video understanding and video foundation models, and he leads the company’s global research organization from a base in Seoul.

Soyoung Lee

Co-Founder & Head of Business Development

A graduate of Korea University with a software engineering background at leading Korean and U.S. technology companies. She drives enterprise growth strategy and strategic partnerships, forming a core pillar of TwelveLabs’ go-to-market organization.

Dave Chung

Co-Founder & COO, GTM Asia-Pacific

A graduate of Yonsei University, Dave Chung oversees company-wide operations and Asia-Pacific business expansion, and has personally led major Korean customer projects such as the UNICEF Korea deployment.

Sungjun Kim (SJ Kim)

Co-Founder & Head of Engineering

Oversees the engineering organization behind TwelveLabs’ large-scale video indexing and search infrastructure, building scalable systems capable of reliably processing and serving millions of hours of video, and underpinning the team’s overall execution capability.

TwelveLabs’ founding philosophy starts from the recognition that events in the world unfold through motion, not through text. Language is a retrospective compression of reality created by humans, and in that compression, raw sensory information — shape, movement, sound, sequence — is lost. While most AI systems are trained on this “compressed output,” TwelveLabs chose from its earliest days not to rely on language models that skim a handful of sampled frames or on captions, but instead to build a native representation system capable of perceiving, indexing, searching, and reasoning over video as it actually is. Five years later, this principle remains the central thesis running through every product and research direction at the company.

🎬

Section 02

Business Status & Product Portfolio

Rather than reducing video to frame-level metadata or subtitles, TwelveLabs treats visual, audio, speech, and on-screen text signals as a single unified representation — the foundation of what the company calls its “Video Cognition System.” This system is designed around three interacting components — Perception, Memory, and Reasoning — with the goal of transforming video archives from passive storage into machine-readable memory addressable down to the exact second. As of May 2026, the company had 192 employees company-wide, and in addition to its dual-hub structure in San Francisco and Seoul, it plans to open new offices in New York and London following its Series B raise.

30,000+ Cumulative Developer Users

192 Employees (as of May 2026)

4 Cities SF · Seoul · New York · London

95% Search-Time Reduction, UNICEF Korea

Core Foundation Models & Product Lineup:

🧠

Marengo

Multimodal Embedding Model

An embedding model that maps visual, audio, speech, and on-screen text signals into a single searchable representation space. Marengo 3.0, released in December 2025, introduced a new multi-vector approach that substantially improved any-to-any search accuracy across text, image, and video, and is also distributed through Amazon Bedrock.

💬

Pegasus

Video-to-Language Generation Model

Built on the representations Marengo generates, Pegasus is a video-to-text model that produces grounded descriptions, answers, and summaries based on video content. It supports natural-language Q&A, automatic chapter generation, highlight extraction, and structured (JSON) metadata output for downstream workflows.

🔎

Embed · Search · Analyze

API & Playground

Developers access three core capabilities — Embed, Search, and Analyze — through a usage-based API and an interactive Playground. Initial indexing is offered free of charge, and customers can fine-tune models on their own data through built-in customization options.

The Three-Layer Structure of the Video Cognition System: ① Perception — Marengo converts raw video into a meaningful representation without prematurely reducing it to text. ② Memory — once a video enters the system, it is understood a single time and converted into a persistent representation that remains addressable down to the exact second of a specific file. ③ Reasoning — the system compares and searches patterns distributed across hundreds of broadcasts or an entire season, rather than a single clip, to produce evidence-based conclusions. TwelveLabs frames this structure not as a model demo but as an architecture for converting video into computable data.

Distribution Channels & Ecosystem Partnerships: TwelveLabs’ models are formally integrated into Amazon Bedrock, and are also connected as embedding services within Databricks’ Mosaic AI Vector Search and Snowflake’s Cortex AI. From its earliest research stages, the company formed a multi-year partnership with Oracle Cloud, securing infrastructure built on NVIDIA H100 and L40S GPUs, and in February 2026 became one of the first companies in Korea to deploy NVIDIA Blackwell Ultra B300 in production via AWS. The company has also formed partnerships with LG CNS (July 2025, joint development of AI video solutions) and VAST Data (February 2026, extending customer-managed deployment beyond the public cloud), while integrating with leading media asset management (MAM) providers such as Mimir, Iconik, and Adobe.

Flagship Customer Case Study — UNICEF Korea Media Archive Transformation (April 2026): UNICEF Korea’s archive of more than 8TB of photos and video, accumulated over decades and scattered across individual PCs and network storage, was migrated to Amazon S3 and fully indexed using TwelveLabs’ video-native AI. Staff can now use natural-language queries such as “children collecting water at a field site in Africa” to instantly retrieve precise, timestamped results, reducing search time by approximately 95%. This marked TwelveLabs’ first deployment in the non-profit sector, demonstrating an expansion beyond media and enterprise into any organization managing large, mission-critical archives. Similar archive-reuse projects are underway with major Korean broadcasters such as SBS.

☁️

Formal Amazon Bedrock Integration

AWS AI Competency Achieved

🏛️

In-Q-Tel Strategic Investment

U.S. intelligence-linked venture fund — a trust asset in public sector and security

🏆

CB Insights AI 100

Named for three consecutive years (also a Fast Company Most Innovative Companies honoree)

💰

Section 03

Funding History

Since its 2021 founding, TwelveLabs has raised more than $207 million cumulatively across five rounds — Seed, a Seed Extension, Series A, a Strategic Round, and Series B. Deep-tech specialist VCs Index Ventures and Radical Ventures led the earliest seed-stage rounds, and institutional capital began flowing in earnest with the 2024 Series A, co-led by NEA and NVIDIA’s NVentures. The late-2024 Strategic Round brought in infrastructure and telecom players — Databricks, Snowflake, and SK Telecom — as strategic investors paired with commercial partnerships, and the 2026 Series B, co-led by NEA and NAVER Ventures with Amazon participating directly, completed a distinctive capital structure that combines U.S. hyperscaler capital with Korean strategic capital simultaneously.

January 2021

TwelveLabs Founded — Development of Video Foundation Models Begins

Founding Capital (Undisclosed)

Jae Lee co-founded the company in San Francisco alongside Aiden Lee, Soyoung Lee, Dave Chung, and Sungjun Kim. The team signed a multi-year cloud partnership with Oracle to secure the GPU infrastructure needed for early model training, and began development of a multi-billion-parameter video foundation model.

March 2022

Seed Round — Led by Index Ventures and Radical Ventures

$5M

Index Ventures led the initial seed round with Radical Ventures co-participating, securing the company’s first institutional investment. While still in closed beta, the capital was used to validate an early platform capable of extracting visual, audio, text, and speech context from video and mapping the relationships among them.

Index Ventures Radical Ventures

December 2022

Seed Extension — Led by Radical Ventures, Bringing Cumulative Seed Funding to $17M

$12M

Radical Ventures led the extension, with Index Ventures re-participating and Jeffrey Katzenberg’s WndrCo and Spring Ventures joining as new investors. Notable angels including Algolia founder Nicolas Dessaigne and Weights & Biases CEO Lukas Biewald also participated. Radical Ventures partner Rob Toews joined the board, and — building on search-algorithm accuracy that had more than doubled over the prior six months — the company accelerated development of a multi-billion-parameter foundation model dedicated to video.

Radical Ventures (Lead) Index Ventures (re-participation) WndrCo (Jeffrey Katzenberg) Spring Ventures

June 2024

Series A — Co-Led by NEA and NVIDIA’s NVentures, Bringing Cumulative Total to ~$77M

$50M

New Enterprise Associates (NEA) and NVIDIA’s venture arm NVentures co-led the company’s first large institutional round, with existing investors Index Ventures, Radical Ventures, WndrCo, and Korea Investment Partners all re-participating. TwelveLabs integrated NVIDIA H100 and L40S GPUs along with the Triton Inference Server and TensorRT extensively into its platform, and used the proceeds to expand R&D and nearly double headcount by hiring more than 50 new employees by year-end.

New Enterprise Associates (Co-Lead) NVIDIA NVentures (Co-Lead) Index Ventures · Radical Ventures · WndrCo · Korea Investment Partners

December 2024

Strategic Round — Databricks, SK Telecom, Snowflake, and Other Infrastructure Players Join, Bringing Cumulative Total to $107M

$30M

Global data and infrastructure companies Databricks and Snowflake Ventures, along with Korean telecom operator SK Telecom, plus HubSpot Ventures and In-Q-Tel (IQT), each made a strategic investment. Databricks and Snowflake simultaneously announced integrations bringing TwelveLabs’ embedding service into Mosaic AI Vector Search and Cortex AI respectively, while SK Telecom pursued joint development of next-generation AI services. Alongside this round, TwelveLabs hired Dr. Yoon Kim — former CTO of SK Telecom and a key architect of Apple’s Siri — as President and Chief Strategy Officer, strengthening its push into the global enterprise market.

Databricks Snowflake Ventures SK Telecom HubSpot Ventures In-Q-Tel (IQT)

July 1, 2026 (Most Recent) — Series B

Co-Led by NEA and NAVER Ventures, Amazon Participates — Cumulative Total Exceeds $207M; Multi-Year AWS Trainium Agreement Signed

$100M

Round Structure: NEA and NAVER Ventures co-led the round, with Amazon joining alongside existing investors Radical Ventures, Index Ventures, and Korea Investment Partners, plus new participants Quadrille Capital and Red Bull Ventures. Notably, NAVER Ventures — for whom TwelveLabs was its very first investment — expanded its position to co-lead the Series B, reflecting a deepening of confidence from Korean corporate capital.

Strategic Significance: Alongside the investment, Amazon’s cloud division AWS signed a multi-year agreement to host TwelveLabs’ compute workloads on its proprietary Trainium AI chips, and committed to offering new models to developers through Amazon Bedrock. Proceeds will fund R&D across the San Francisco and Seoul hubs, along with new offices in New York and London to serve growing global enterprise demand.

Positioning Shift: With this round, TwelveLabs stated it is expanding beyond providing video understanding models toward becoming a full-stack agentic video intelligence system that combines perception, knowledge, and reasoning into a single architecture.

New Enterprise Associates (Co-Lead) NAVER Ventures (Co-Lead) Amazon (Multi-Year AWS Trainium Agreement) Radical Ventures · Index Ventures · Korea Investment Partners Quadrille Capital · Red Bull Ventures (New)

🌐

A Dual U.S.–Korea Strategic Capital Structure — Hyperscalers and Korean Corporate Capital, Combined

TwelveLabs’ investor roster is unusual in bringing together U.S. venture and Big Tech capital — NEA, Amazon, Radical Ventures, Index Ventures — alongside Korean strategic capital including NAVER Ventures, SK Telecom, Samsung, and Korea Investment Partners. The mix is further distinguished by unconventional strategic backers such as Firstman Studio, the production company of “Squid Game” director Hwang Dong-hyuk ($3M, October 2025), and In-Q-Tel, a venture fund linked to U.S. intelligence agencies — extending the company’s strategic network from media and entertainment all the way to the public sector and security.

🏆

Section 04

Core Competitive Advantages

TwelveLabs’ competitive defensibility rests on a video-native architecture structurally differentiated from general-purpose multimodal models that rely on frame sampling or captioning; cross-hyperscaler integration that avoids single-cloud dependency; trust capital built in the public sector and defense; and a dual growth engine that combines U.S. and Korean strategic capital simultaneously.

🧬

Video-Native Foundation Model Architecture — The Marengo/Pegasus Dual Structure

Unlike general-purpose multimodal models such as Google Gemini, or frame-level metadata tools such as Google Cloud Video AI, AWS Rekognition, and Azure Video Indexer, TwelveLabs invested more than three years from inception building an embedding model (Marengo) and a generative model (Pegasus) trained to learn video’s native representation from the ground up. Rather than performing a “one-time inspection” of video only at query time, the architecture understands and permanently indexes video the moment it enters the system, converting it into machine-readable memory addressable down to the exact second — a foundational technical gap competitors cannot easily replicate in the short term.

☁️

A Distribution Moat Built on Cross-Hyperscaler Integration

TwelveLabs’ embedding services are integrated across multiple cloud and data platforms — Amazon Bedrock, Databricks’ Mosaic AI Vector Search, Snowflake’s Cortex AI, and Oracle Cloud — giving the company broad enterprise reach without dependence on any single cloud provider. The multi-year AWS Trainium hosting agreement signed alongside the 2026 Series B deepened this distribution structure further, providing access to developer and enterprise channels at a scale the company’s own sales organization could not reach alone.

🛡️

Trust Capital in the Public Sector and Defense — In-Q-Tel and the Founding Team’s Military Background

CEO Jae Lee’s service with South Korea’s Cyber Operations Command, along with the strategic investment from In-Q-Tel — a venture fund linked to U.S. intelligence agencies — provide a foundation of trust with government and public-sector customers that is difficult for typical AI startups to build. The company’s business has extended into mission-critical public projects such as real-time threat detection, faster emergency response, and traffic management for municipalities, and it has also successfully entered the non-profit and public sector through its UNICEF Korea deployment.

🌏

Dual U.S.–Korea Strategic Capital and a Seoul–San Francisco Dual-Hub Structure

Securing both U.S. capital — NEA, Amazon, Radical Ventures, Index Ventures — and Korean strategic capital — NAVER Ventures, SK Telecom, Samsung, Korea Investment Partners — as shareholders simultaneously is a clear point of differentiation from competitors. Through the founder’s board seat on South Korea’s foundation model industry association, the company has direct access to the AI ecosystems of major Korean conglomerates such as Samsung, SK, and LG, while maintaining equally strong traction in the U.S. hyperscaler and enterprise market — a structural advantage few peers can match.

Next-Stage Growth Driver — Expansion into a Full-Stack Agentic Video Intelligence System: Beyond providing video understanding models, TwelveLabs has laid out a roadmap to build a complete agentic video intelligence system that combines perception, knowledge, and reasoning into a single architecture. Building on its current strength in media and entertainment — sports leagues, film studios, and major content creators — the company plans to expand into verticals such as advertising, security, sports, and automotive, while strengthening its ability to serve global enterprise customers through new offices in New York and London.

TwelveLabs, Series B $100M

TwelveLabs

이 글 공유하기:

이것이 좋아요:

댓글 남기기응답 취소

Global VC Megadeal Briefing에서 더 알아보기