Building an E-commerce Search Engine with Tantivy (Rust)
Published 2026-03-30 15:40:16
Introduction
Search is the backbone of any e-commerce platform. Users expect fast, relevant, and typo-tolerant results when searching for products. In the Rust ecosystem, Tantivy is a powerful full-text search engine library inspired by Apache Lucene.
In this article, we'll build a product search system for e-commerce data using Rust and Tantivy, then outline the architecture needed to run it in production.
What is Tantivy?
Tantivy is a high-performance, full-text search engine library written in Rust.
🔑 Features
Full-text search with ranking (BM25)
Schema-based indexing
Fast and memory-efficient
Tokenization and text analysis
Faceting and filtering support
🛒 Use Case: E-commerce Product Search
We want to support:
Search by product name
Filter by category
Sort by price or relevance
Handle typos (“iphnoe” → “iphone”)
📦 Step 1: Setup Project
cargo new ecommerce-search
cd ecommerce-search
Add dependencies:
[dependencies]
tantivy = "0.21"
serde = { version = "1", features = ["derive"] }
🧱 Step 2: Define Schema
Tantivy requires a schema to define searchable fields.
use tantivy::schema::*;
let mut schema_builder = Schema::builder();
let id = schema_builder.add_u64_field("id", STORED);
let name = schema_builder.add_text_field("name", TEXT | STORED);
let description = schema_builder.add_text_field("description", TEXT);
let category = schema_builder.add_text_field("category", STRING | STORED);
// FAST is required to sort by price in Step 10
let price = schema_builder.add_f64_field("price", FAST | STORED);
let schema = schema_builder.build();
🧠 Field Types Explained
TEXT → full-text search (tokenized)
STRING → exact match (for filtering)
STORED → retrievable in results
📥 Step 3: Index Product Data
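use tantivy::{doc, Index};
// create_in_dir expects the directory to exist already
std::fs::create_dir_all("./index")?;
let index = Index::create_in_dir("./index", schema.clone())?;
// 50 MB memory budget, shared by the indexing threads
let mut writer = index.writer(50_000_000)?;
writer.add_document(doc!(
    id => 1u64,
    name => "iPhone 14",
    description => "Latest Apple smartphone",
    category => "electronics",
    price => 999.0f64
))?;
// Documents become searchable only after a commit
writer.commit()?;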
🔍 Step 4: Searching Products
use tantivy::query::QueryParser;
let reader = index.reader()?;
let searcher = reader.searcher();
let query_parser = QueryParser::for_index(&index, vec![name, description]);
let query = query_parser.parse_query("iphone")?;
let top_docs = searcher.search(&query, &tantivy::collector::TopDocs::with_limit(10))?;
📊 Step 5: Retrieve Results
for (_score, doc_address) in top_docs {
    let retrieved = searcher.doc(doc_address)?;
    // to_json returns a String, so print it directly
    println!("{}", schema.to_json(&retrieved));
}
⚡ Step 6: Add Filters (Category)
// category is a STRING field, so the value must match exactly, including case
let query = query_parser.parse_query("iphone AND category:electronics")?;
🧠 Step 7: Ranking (BM25)
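Tantivy scores matches with BM25 by default. For each query term, the score reflects:
Term frequency: how often the term occurs in the document, with diminishing returns
Inverse document frequency: rare terms count for more than common ones
Field length: a match in a short field outweighs the same match in a long one
👉 No extra configuration is needed for a reasonable baseline ranking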
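To make the scoring concrete, here is a minimal sketch of the BM25 contribution of a single term, written in plain Rust rather than against Tantivy's internals. The function name `bm25_term_score` is hypothetical, and the constants `K1 = 1.2` and `B = 0.75` are the commonly used defaults, assumed here for illustration.

```rust
/// BM25 contribution of one query term to one document.
///
/// * `tf`: occurrences of the term in the field
/// * `doc_len`, `avg_doc_len`: field length in tokens and the index-wide average
/// * `doc_count`: documents in the index
/// * `doc_freq`: documents that contain the term
pub fn bm25_term_score(
    tf: f32,
    doc_len: f32,
    avg_doc_len: f32,
    doc_count: f32,
    doc_freq: f32,
) -> f32 {
    const K1: f32 = 1.2; // term-frequency saturation (assumed default)
    const B: f32 = 0.75; // strength of length normalization (assumed default)

    // Rare terms get a larger inverse document frequency.
    let idf = (1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)).ln();
    // Fields longer than average are penalized, shorter ones are favored.
    let length_norm = 1.0 - B + B * doc_len / avg_doc_len;
    // Repeated occurrences help, but with diminishing returns.
    idf * tf * (K1 + 1.0) / (tf + K1 * length_norm)
}
```

Calling it with the same term in a 5-token field and a 50-token field shows why a match in a short product name tends to outrank the same match buried in a long description.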
🔤 Step 8: Tokenization & Text Analysis
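Build and register a custom analyzer (Tantivy 0.21 uses TextAnalyzer::builder):
use tantivy::tokenizer::*;
let en_stem = TextAnalyzer::builder(SimpleTokenizer::default())
    .filter(LowerCaser)
    .filter(Stemmer::new(Language::English))
    .build();
// A field uses this analyzer only if its TextFieldIndexing names it via set_tokenizer
index.tokenizers().register("custom_en", en_stem);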
👉 Helps match:
“running” → “run”
“phones” → “phone”
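To show what an analyzer pipeline does to text before it reaches the index, here is a toy version in plain Rust. It is not Tantivy's implementation: `analyze` and `strip_plural` are hypothetical names, and the plural stripping is a deliberately crude stand-in for a real stemmer (it would not reduce "running" to "run").

```rust
/// Toy analyzer: split on non-alphanumeric characters, lowercase each token,
/// then strip a trailing plural "s".
pub fn analyze(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .map(|token| token.to_lowercase())
        .map(|token| strip_plural(&token))
        .collect()
}

/// Crude plural handling: "phones" becomes "phone", while short words and
/// words ending in "ss" (such as "wireless") are left alone.
fn strip_plural(token: &str) -> String {
    if token.len() > 3 && token.ends_with('s') && !token.ends_with("ss") {
        token[..token.len() - 1].to_string()
    } else {
        token.to_string()
    }
}
```

The same pipeline must run on both documents and queries; otherwise a query for "Phones" would never meet the indexed token "phone".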
🧪 Step 9: Typo Tolerance (Fuzzy Search)
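use tantivy::{query::FuzzyTermQuery, Term};
let fuzzy_query = FuzzyTermQuery::new(Term::from_field_text(name, "iphnoe"), 1, true); // max distance 1; a swap of adjacent letters counts as one edit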
👉 Enables:
“iphnoe” → “iphone”
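Fuzzy matching is built on edit distance. The sketch below computes that distance in plain Rust so you can see why "iphnoe" is one edit away from "iphone" when swaps of adjacent letters are allowed. The function `edit_distance` is a hypothetical helper for illustration; Tantivy itself matches fuzzy terms with an automaton rather than a table like this.

```rust
/// Edit distance between `a` and `b`. When `transposition_cost_one` is true,
/// swapping two adjacent characters counts as one edit (optimal string
/// alignment); otherwise a swap costs two edits, as in plain Levenshtein.
pub fn edit_distance(a: &str, b: &str, transposition_cost_one: bool) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // d[i][j] = distance between the first i chars of a and the first j of b.
    let mut d = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in 0..=a.len() {
        d[i][0] = i;
    }
    for j in 0..=b.len() {
        d[0][j] = j;
    }

    for i in 1..=a.len() {
        for j in 1..=b.len() {
            let substitution = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1) // deletion
                .min(d[i][j - 1] + 1) // insertion
                .min(d[i - 1][j - 1] + substitution);
            // Adjacent transposition, e.g. "no" <-> "on".
            if transposition_cost_one
                && i > 1
                && j > 1
                && a[i - 1] == b[j - 2]
                && a[i - 2] == b[j - 1]
            {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[a.len()][b.len()]
}
```

With transpositions counted as one edit, `edit_distance("iphnoe", "iphone", true)` is 1; without them it is 2, which is why the transposition flag matters for common typing errors.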
📈 Step 10: Sorting by Price
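use tantivy::collector::TopDocs;
use tantivy::{DocAddress, Order};
// Requires price to be a FAST field; the annotation tells Tantivy to read it as f64
let top_docs: Vec<(f64, DocAddress)> = searcher.search(
    &query,
    &TopDocs::with_limit(10).order_by_fast_field("price", Order::Asc),
)?;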
🏗️ Production Architecture
1. Indexing Pipeline
Ingest product data (DB → Tantivy)
Batch indexing
Periodic commits
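The batching idea can be sketched without any Tantivy types. In the example below, `index_in_batches` is a hypothetical helper: in a real pipeline the `add` closure would call the index writer's `add_document` and the `commit` closure would call its `commit`, with error handling that is omitted here.

```rust
/// Feeds `items` to `add` in chunks of `batch_size`, calling `commit` once
/// per chunk instead of once per item. Returns the number of commits made.
pub fn index_in_batches<T>(
    items: &[T],
    batch_size: usize,
    mut add: impl FnMut(&T),
    mut commit: impl FnMut(),
) -> usize {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut commits = 0;
    for batch in items.chunks(batch_size) {
        for item in batch {
            add(item);
        }
        // One commit per batch keeps commit overhead proportional to the
        // number of batches rather than the number of products.
        commit();
        commits += 1;
    }
    commits
}
```

Ten products with a batch size of four result in three commits (4 + 4 + 2) instead of ten. The right batch size depends on how quickly new products must become searchable.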
2. Search API Layer
Use a web framework (like Axum):
/search?q=iphone
/search?q=phone&category=electronics
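Whatever framework serves these endpoints, the handler has to turn request parameters into a query string for the parser from Step 6. Here is a framework-independent sketch of that step. `SearchParams`, `parse_search_params`, and `to_query_string` are hypothetical names, percent-decoding is left to the web framework, and the character filtering is a conservative assumption meant to keep user input from injecting query syntax.

```rust
#[derive(Debug, Default, PartialEq)]
pub struct SearchParams {
    pub q: String,
    pub category: Option<String>,
}

/// Keeps letters and digits (and optionally spaces), dropping everything else.
fn keep_safe_chars(value: &str, allow_spaces: bool) -> String {
    value
        .chars()
        .filter(|c| c.is_alphanumeric() || (allow_spaces && *c == ' '))
        .collect::<String>()
        .trim()
        .to_string()
}

/// Parses a raw query string such as `q=phone&category=electronics`.
/// Percent-decoding is not handled here.
pub fn parse_search_params(query_string: &str) -> SearchParams {
    let mut params = SearchParams::default();
    for pair in query_string.split('&') {
        if let Some((key, value)) = pair.split_once('=') {
            match key {
                // Lowercased so that words like AND are treated as plain terms
                // (an assumption about the query grammar).
                "q" => params.q = keep_safe_chars(&value.replace('+', " "), true).to_lowercase(),
                "category" => {
                    let category = keep_safe_chars(value, false);
                    if !category.is_empty() {
                        params.category = Some(category);
                    }
                }
                _ => {} // ignore unknown parameters
            }
        }
    }
    params
}

/// Builds the query text in the same syntax as Step 6.
pub fn to_query_string(params: &SearchParams) -> String {
    match (&params.category, params.q.is_empty()) {
        (Some(category), false) => format!("({}) AND category:{}", params.q, category),
        (Some(category), true) => format!("category:{}", category),
        (None, _) => params.q.clone(),
    }
}
```

For `q=phone&category=electronics` this yields `(phone) AND category:electronics`, which can be passed to `parse_query`. Because category is a STRING field, the category value is deliberately not lowercased.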
3. Caching Layer
Cache popular queries
Use Redis or in-memory cache
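For the in-memory option, a small time-to-live cache is often enough to start with. The sketch below is one possible shape, not a recommendation over Redis: `QueryCache` is a hypothetical type, it caches lists of product IDs (an assumption about what the API returns), and it has no size limit, which a real deployment would need.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caches product IDs per normalized query string for a fixed time-to-live.
pub struct QueryCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Vec<u64>)>,
}

impl QueryCache {
    pub fn new(ttl: Duration) -> Self {
        QueryCache { ttl, entries: HashMap::new() }
    }

    /// "iPhone " and "iphone" should share one cache entry.
    fn normalize(query: &str) -> String {
        query.trim().to_lowercase()
    }

    /// Returns cached IDs if the entry exists and has not expired.
    pub fn get(&mut self, query: &str) -> Option<Vec<u64>> {
        let key = Self::normalize(query);
        let fresh = match self.entries.get(&key) {
            Some((stored_at, ids)) if stored_at.elapsed() < self.ttl => Some(ids.clone()),
            _ => None,
        };
        if fresh.is_none() {
            // Drop expired entries so they do not accumulate.
            self.entries.remove(&key);
        }
        fresh
    }

    pub fn put(&mut self, query: &str, ids: Vec<u64>) {
        self.entries.insert(Self::normalize(query), (Instant::now(), ids));
    }
}
```

A short TTL also bounds how long a stale result can be served after the index changes, so it should be chosen together with the commit interval of the indexing pipeline.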
4. Re-ranking Layer (Advanced)
Combine:
Text relevance (BM25)
Business signals (sales, ratings)
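One simple way to combine these signals is a weighted sum applied to the top candidates returned by the text search. The sketch below assumes a hypothetical `Candidate` struct, and the weights (0.7, 0.2, 0.1) are arbitrary illustrations that would need tuning against real click or conversion data.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u64,
    pub bm25: f32,   // text relevance from the search engine
    pub sales: u32,  // units sold (hypothetical business signal)
    pub rating: f32, // average rating on a 0 to 5 scale
}

/// Weighted blend of text relevance and business signals.
pub fn final_score(c: &Candidate) -> f32 {
    // Log-scale sales so a bestseller cannot drown out relevance entirely.
    let popularity = (1.0 + c.sales as f32).ln();
    let rating = c.rating / 5.0;
    0.7 * c.bm25 + 0.2 * popularity + 0.1 * rating
}

/// Sorts candidates by blended score, best first.
pub fn rerank(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by(|a, b| final_score(b).total_cmp(&final_score(a)));
    candidates
}
```

Re-ranking only the top results from Tantivy, rather than every match, keeps the extra cost per query small and constant.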
🔑 Advanced Features
Faceted Search
Filter by category, brand, price range
Autocomplete
Prefix queries
Synonyms
“mobile” = “phone”
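Synonyms can be handled at query time by expanding each term before it reaches the query parser. The sketch below shows that approach in plain Rust. `expand_synonyms` is a hypothetical helper, the synonym map is supplied by the caller, and the output relies on the `OR` and parenthesis syntax of the query parser.

```rust
use std::collections::HashMap;

/// Rewrites each query word that has synonyms into an OR group,
/// e.g. "mobile" becomes "(mobile OR phone)".
pub fn expand_synonyms(query: &str, synonyms: &HashMap<&str, Vec<&str>>) -> String {
    query
        .split_whitespace()
        .map(|word| {
            let lower = word.to_lowercase();
            match synonyms.get(lower.as_str()) {
                Some(alternatives) if !alternatives.is_empty() => {
                    format!("({} OR {})", lower, alternatives.join(" OR "))
                }
                _ => lower,
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Query-time expansion keeps the index unchanged, so the synonym list can be edited without reindexing. The trade-off is a slightly more expensive query.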
⚠️ Common Pitfalls
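Not marking fields as STORED → they can be searched but not returned in results
Indexing large text fields you never search → larger index and slower indexing
Committing after every document → slow indexing and many small segments to merge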
🎯 When to Use Tantivy
Choose Tantivy if:
You need embedded search (no external service)
You want high performance in Rust
You want full control over indexing
🧠 Comparison
| Feature | Tantivy | Elasticsearch |
|---|---|---|
| Language | Rust | Java |
| Deployment | Embedded | Distributed |
| Performance | In-process calls, no network hop | Requests go over HTTP |
| Complexity | Low | High |
Conclusion
Tantivy is a capable and efficient foundation for building search in Rust. With BM25 ranking, configurable tokenization, and filtering, it covers the core of an e-commerce search system.
If you want full control over indexing and an embedded, Rust-native engine, it is a strong fit.