AI applications rarely work with a single clean table. They mix user profiles, chat logs, JSON metadata, embeddings, and sometimes spatial data. Most teams answer this with a patchwork of OLTP databases, vector stores, and search engines. OceanBase has released seekdb, an open source AI-focused database (under the Apache 2.0 license). seekdb is described as an AI-native search database that integrates relational data, vector data, text, JSON, and GIS into one engine and exposes hybrid search and in-database AI workflows.
What is seekdb?
seekdb is positioned as a lightweight, embedded version of the OceanBase engine, targeted at AI applications rather than general-purpose distributed deployments. It runs as a single-node database, supports both embedded mode and client/server mode, and stays compatible with MySQL drivers and SQL syntax.
In the capability matrix, seekdb is marked as:
- Embedded database: supported
- Standalone database: supported
- Distributed database: not supported
The full OceanBase product, by contrast, covers the distributed case.
From a data model perspective, seekdb supports:
- Relational data using standard SQL
- Vector search
- Full-text search
- JSON data
- Spatial GIS data
All of this lives inside one storage and index layer.
Hybrid search as a core feature
The main feature promoted by OceanBase is hybrid search: a search that combines vector-based semantic retrieval, full-text keyword retrieval, and scalar filters in a single query and a single ranking step.
seekdb implements hybrid search through a system package named DBMS_HYBRID_SEARCH, which has two entry points:
- DBMS_HYBRID_SEARCH.SEARCH returns results as JSON, sorted by relevance.
- DBMS_HYBRID_SEARCH.GET_SQL returns the actual SQL string used for execution.
The same hybrid search path can run:
- Pure vector search
- Pure full-text search
- Combined hybrid search
It can push relational filters and joins down to storage. It also supports query reranking strategies such as weighted scores and reciprocal rank fusion, and can plug in large language model based rerankers.
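To make the reranking step concrete, here is a minimal Python sketch of two common ways to merge a vector result list with a keyword result list: reciprocal rank fusion (RRF) and a weighted score blend. It illustrates the general techniques only and is not seekdb's implementation. The function names, the constant k=60, and the 0.7 vector weight are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is a commonly used default, not a seekdb setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_score_fusion(vector_scores, text_scores, vector_weight=0.7):
    """Blend two dicts of doc id -> score in [0, 1] using a fixed weight."""
    text_weight = 1.0 - vector_weight
    doc_ids = set(vector_scores) | set(text_scores)
    blended = {
        d: vector_weight * vector_scores.get(d, 0.0)
        + text_weight * text_scores.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(blended, key=lambda d: (-blended[d], d))


# Hypothetical result lists from a vector index and a full-text index.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF only needs ranks, so it works even when vector distances and text relevance scores are on different scales. A weighted blend needs scores that have already been normalized to a common range.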
For Retrieval Augmented Generation (RAG) and agent memory, this means you can write a single SQL query that performs semantic matching on embeddings, exact matching on product codes or proper nouns, and relational filtering at user or tenant scope.
A closer look at the vector and full-text engines
At its core, seekdb has a modern vector and full-text stack.
For vectors, seekdb:
- Supports dense and sparse vectors
- Supports Manhattan, Euclidean, inner product, and cosine distance metrics
- Provides in-memory index types such as HNSW, HNSW SQ, and HNSW BQ
- Provides disk-based index types including IVF and IVF PQ
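For readers who want the four distance metrics spelled out, the following Python sketch computes each one on plain lists. These are the textbook definitions, written for clarity rather than speed. The function names are chosen for this example and say nothing about how seekdb evaluates distances internally.

```python
import math


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Inner product: larger means more similar for comparable norms."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: depends on the angle, not the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


query = [3.0, 4.0]
origin_shifted = [0.0, 0.0]
print(manhattan(query, origin_shifted))   # 7.0
print(euclidean(query, origin_shifted))   # 5.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Cosine distance ignores magnitude, which is why it is a frequent choice for text embeddings, while Euclidean and Manhattan distances are sensitive to vector length.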
Hybrid vector indexes let you store raw text and have seekdb automatically call an embedding model, so the system maintains the corresponding vector index without a separate preprocessing pipeline.
For text, seekdb provides full-text search with:
- Keyword, phrase, and Boolean queries
- BM25 relevance ranking
- Multiple tokenizer modes
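As a rough illustration of what BM25 ranking does, the sketch below scores a small corpus against a keyword query using the standard Okapi BM25 formula. It is a teaching example under stated assumptions: the whitespace tokenizer, the parameters k1=1.2 and b=0.75, and the IDF variant are common textbook choices, not values taken from seekdb.

```python
import math
from collections import Counter


def tokenize(text):
    """Simplistic tokenizer (lowercase + whitespace split), assumed for the demo."""
    return text.lower().split()


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query."""
    if not documents:
        return []
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Longer-than-average documents are penalized via this factor.
        length_norm = 1 - b + b * len(tokens) / avg_len if avg_len else 1.0
        score = 0.0
        for term in tokenize(query):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "hybrid search combines vector and keyword search",
    "spatial indexes support polygon queries",
    "keyword search uses an inverted index",
]
scores = bm25_scores("keyword search", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print([docs[i] for i in ranked if scores[i] > 0])
```

The document with no query terms scores zero and drops out. The saturation term controlled by k1 keeps repeated occurrences of a word from dominating the score.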
Importantly, hybrid search requires no external orchestration, because full-text and vector indexes are first-class citizens integrated into the same query planner as scalar and GIS indexes.
AI functions inside the database
seekdb includes built-in AI function expressions that let you call models directly from SQL, without a separate application service mediating each call. The main functions are:
- AI_EMBED: converts text to embeddings
- AI_COMPLETE: generates text using chat or completion models
- AI_RERANK: reranks a list of candidates
- AI_PROMPT: assembles a prompt template and dynamic values into a JSON object for AI_COMPLETE
Model metadata and endpoints are managed by the DBMS_AI_SERVICE package, which lets you register external providers, set URLs, and configure keys, all on the database side.
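The idea of packaging a prompt template together with its dynamic values as one JSON object can be sketched in a few lines of Python. This is only an analogy for the pattern described above. The helper names, the JSON keys "template" and "args", and the {0}/{1} placeholder style are hypothetical and are not taken from seekdb's documentation.

```python
import json


def build_prompt(template, *values):
    """Package a template and its positional values as one JSON string.

    The key names "template" and "args" are hypothetical, chosen for this demo.
    """
    return json.dumps({"template": template, "args": list(values)})


def render_prompt(prompt_json):
    """Substitute {0}, {1}, ... placeholders with the packaged values."""
    prompt = json.loads(prompt_json)
    return prompt["template"].format(*prompt["args"])


packed = build_prompt(
    "Summarize the ticket from {0} about {1}.", "tenant_42", "login errors"
)
rendered = render_prompt(packed)
print(rendered)  # Summarize the ticket from tenant_42 about login errors.
```

Keeping the template and the values separate until the last step means the same template can be reused across rows, with only the values changing per call.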
Multimodal data and workloads
seekdb is built to handle multiple data modalities on a single node. It has a multimodal data and index layer covering vector, text, JSON, and GIS, and a multi-model compute layer for hybrid workloads across vector, full-text, and scalar conditions.
It also provides JSON indexes for metadata queries and GIS indexes for spatial conditions. This allows a single query to:
- Find semantically similar documents
- Filter by JSON metadata such as tenant, region, or category
- Constrain by spatial range or polygon
without leaving the same engine.
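The three conditions above can be combined in a small in-memory Python sketch: filter by metadata, restrict to a polygon, then rank the survivors by embedding similarity. This shows the logic of such a query, not how seekdb plans or executes it. The record layout, field names, and the `search` helper are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def point_in_polygon(point, polygon):
    """Ray casting: count polygon edges crossed by a ray going right from the point."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def search(records, query_vector, metadata_filter, polygon, top_k=3):
    """Apply metadata and spatial filters, then rank by semantic similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in metadata_filter.items())
        and point_in_polygon(r["location"], polygon)
    ]
    candidates.sort(
        key=lambda r: cosine_similarity(r["embedding"], query_vector),
        reverse=True,
    )
    return [r["id"] for r in candidates[:top_k]]


# Hypothetical records: an embedding, JSON-like metadata, and an (x, y) location.
records = [
    {"id": "store_1", "embedding": [0.9, 0.1], "metadata": {"tenant": "acme"}, "location": (1.0, 1.0)},
    {"id": "store_2", "embedding": [0.1, 0.9], "metadata": {"tenant": "acme"}, "location": (2.0, 2.0)},
    {"id": "store_3", "embedding": [0.9, 0.1], "metadata": {"tenant": "other"}, "location": (1.0, 1.0)},
    {"id": "store_4", "embedding": [0.9, 0.1], "metadata": {"tenant": "acme"}, "location": (9.0, 9.0)},
]
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
result = search(records, [1.0, 0.0], {"tenant": "acme"}, square)
print(result)  # ['store_1', 'store_2']
```

In a patchwork architecture each of these steps would typically hit a different system. Running them in one engine avoids shipping candidate ids between services.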
seekdb is derived from the OceanBase engine, so it inherits ACID transactions, hybrid row and column storage, and vectorized execution, but large-scale distributed deployments remain the job of the full OceanBase database.
Key takeaways
- AI-native hybrid search: seekdb integrates vector search, full-text search, and relational filtering into a single SQL and DBMS_HYBRID_SEARCH interface, so RAG and agent workloads can perform multi-signal retrieval in one query instead of stitching together multiple engines.
- Multimodal data in one engine: seekdb stores and indexes relational data, vectors, text, JSON, and GIS in the same engine. This lets AI applications keep documents, embeddings, and metadata consistent without maintaining separate databases.
- In-database AI functions for RAG: AI_EMBED, AI_COMPLETE, AI_RERANK, and AI_PROMPT let seekdb call embedding models, LLMs, and rerankers directly from SQL. This simplifies the RAG pipeline and moves more orchestration logic into the database layer.
- Single-node, embedded-friendly design: seekdb is a single-node, MySQL-compatible engine that supports embedded and standalone modes, while large-scale distributed deployments remain the role of the full OceanBase product. This makes seekdb suitable for local, edge, and service-embedded AI workloads.
- Open source and tool ecosystem: seekdb is open sourced under Apache 2.0 and integrates with a growing ecosystem of AI tools and frameworks. With Python support through pyseekdb and MCP-based integration for code assistants and agents, it can serve as a unified data plane for AI applications.

