You have a VPS. You have users who type “invoice 2021 acme” and expect magic. You also have an ops budget that looks suspiciously like “you, after dinner.” The question isn’t whether OpenSearch can search better than MySQL. It can. The question is whether running it yourself on a small box is a smart constraint… or a slow-motion incident.
This is the practical difference between “search as a feature” and “search as an always-on distributed system that takes personal offense at low RAM.” Pick the wrong tool and your VPS will teach you humility, one swap storm at a time.
The decision frame: what you’re really choosing
If you reduce the decision to “which is faster,” you’ll get an expensive answer and an unreliable system. What you’re choosing is:
- Search quality vs operational complexity. MySQL can do decent keyword search. OpenSearch can do relevance tuning, fuzziness, language analyzers, synonyms, faceting, highlighting, and more. It also wants memory, disk IO, and continuous attention.
- One system vs two systems of record. If you add OpenSearch, MySQL is still your system of record. Now you also have an index that is “eventually correct.” You must build and operate synchronization.
- Failure modes. When MySQL is slow, it’s usually queries, locks, or IO. When OpenSearch is slow, it might be heap pressure, segment merges, cache churn, GC pauses, shard layout, refresh/flush behavior, or a disk that lies about its performance.
- What you can afford to be wrong about. On a small VPS, a wrong assumption can turn into “everything is fine” right up until the kernel OOM-kills Java.
Here’s the blunt guidance:
- If your search is simple (title/body keywords, filters by a few columns, small-ish corpus), start with MySQL and don’t apologize. Use proper indexes, maybe FULLTEXT, and accept that relevance is “good enough.”
- If search is core product value (relevance, typo tolerance, facets, multi-field weighting, synonyms, language), use OpenSearch—but consider managed offerings before self-hosting on a tiny VPS.
- If you still want to self-host OpenSearch on a VPS, do it like you’re running a production service: monitoring, snapshots, heap tuning, disk checks, and a tested recovery plan. Otherwise you’re not self-hosting; you’re improvising.
One quote that belongs on the wall of anyone self-hosting search: “Everything fails all the time.”
— Werner Vogels.
Facts & history that explain today’s pain
Some context makes the tradeoffs feel less arbitrary. These are concrete, useful facts that map directly to operational reality:
- Lucene is the core engine under Elasticsearch and OpenSearch. It’s been around since the late 1990s, and its segment/merge model is why disk IO matters so much.
- MySQL FULLTEXT predates modern “search platforms.” It was built to add keyword search to relational workloads, not to be a relevance lab.
- Elasticsearch started in 2010 as a distributed search server built on Lucene, designed to make “search clusters” approachable—until you had to operate one.
- OpenSearch forked from Elasticsearch 7.10 after licensing changes. Operationally, it inherits most of Elasticsearch’s strengths and footguns.
- Inverted indexes are why OpenSearch is fast at text search. It’s not a “database doing LIKE”; it’s a different data structure optimized for retrieval.
- MySQL’s LIKE '%term%' can’t use a normal B-tree index for leading wildcards, which is why people end up doing table scans and then blaming “MySQL search.”
- Refresh vs commit is a real concept in Lucene-based systems. A document can be “searchable” quickly (refresh) but not necessarily durable until later (commit/snapshot semantics).
- Small VPS disks often have burst credits. Your benchmark might look great until credits run out, and then segment merges turn into sludge.
MySQL as a search engine: what works, what breaks
What MySQL does well for search
MySQL is boring in the best way. For “search-ish” features inside a relational app, it can be exactly right:
- Filtering + sorting + pagination on structured columns is its home turf.
- Transactional correctness. When you write data, it exists. No async index pipeline required.
- Operational simplicity. One service, one backup strategy, one set of SLOs.
- Cost efficiency on small hardware. MySQL doesn’t demand a big JVM heap just to exist.
MySQL search patterns that are acceptable
There are three common patterns, in increasing order of “this is actually search.”
- Prefix search on normalized fields (usernames, SKUs): use indexes and prefix matching. Example: WHERE sku LIKE 'ABC%' with an index on sku. Fast and predictable.
- Token search with FULLTEXT (natural language-ish text fields): use FULLTEXT indexes and BOOLEAN MODE. This gives you tokenization and scoring, but it’s not a modern relevance platform. It’s closer to “better than LIKE.” Both of these patterns are sketched after this list.
- Hybrid approach: keep MySQL authoritative, precompute a search column (normalized keywords) for simple matching, and accept limitations. This is often the winning move when a team is small and uptime matters more than relevance tuning.
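A minimal sketch of the first two patterns, assuming a hypothetical products table with an sku column and a posts table with title and body columns; adjust names to your schema:

-- Prefix search: a trailing wildcard can use an ordinary B-tree index.
CREATE INDEX idx_sku ON products (sku);
SELECT id, sku FROM products WHERE sku LIKE 'ABC%' LIMIT 20;

-- Token search: FULLTEXT gives tokenization and a basic relevance score.
ALTER TABLE posts ADD FULLTEXT INDEX ft_title_body (title, body);
SELECT id, title,
       MATCH(title, body) AGAINST('+invoice +acme' IN BOOLEAN MODE) AS score
  FROM posts
 WHERE MATCH(title, body) AGAINST('+invoice +acme' IN BOOLEAN MODE)
 ORDER BY score DESC
 LIMIT 20;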
What breaks (or becomes painful)
- Substring search (contains) at scale. Users type partial fragments; you reach for %term%; the CPU fan auditions for a jet engine.
- Relevance expectations. “Why is this result ranked above that?” In MySQL, the answer is often “because it matched a token more times,” which is not the same as “it’s what the user meant.”
- Typos, stemming, synonyms, multi-language analyzers. You can approximate some of this, but you will end up inventing your own search engine inside SQL. Don’t.
- High-cardinality facets/aggregations. MySQL can group and count, but interactive faceting with multiple dimensions starts to feel like a denial-of-service against your own database.
MySQL search is a good servant and a terrible messiah.
OpenSearch on a VPS: powerful, needy, occasionally dramatic
What OpenSearch gives you that MySQL doesn’t
- Real relevance tooling. Field boosting, BM25 scoring, phrase queries, proximity, and analyzers that treat human language as something other than “a string.”
- Fuzzy matching for typos and near matches.
- Facets and aggregations built for interactive exploration.
- Highlighting, suggestions, autocomplete patterns that users expect from modern search.
- Denormalized documents optimized for reads, not joins.
What OpenSearch demands from you
OpenSearch is a search engine plus a distributed system. Even as a single node on a VPS, it behaves like both.
- Memory management. JVM heap, off-heap, OS page cache, and the uncomfortable reality that “free memory” is a performance feature, not a waste.
- Disk IO. Segment merges, flushes, translog, snapshots. Cheap disks turn “fast indexing” into “why is load average 40.”
- Index lifecycle decisions. Shard count, refresh intervals, replicas (or no replicas on a single node), retention, and mapping discipline.
- Operational hygiene. Snapshots, monitoring, alerting, rolling upgrades (even single-node needs care), and recovery testing.
First joke (short, and earned): Running OpenSearch on a 2 GB VPS is like adopting a husky in a studio apartment. It’s going to redecorate, loudly.
Single-node reality check
A single node means:
- No redundancy. A bad upgrade, a corrupt disk, or a fat-fingered delete is a full outage unless you have snapshots.
- No shard balancing. If your shard strategy is wrong, you live with it until you reindex.
- Performance is bounded by one machine. That sounds obvious until you realize indexing and querying compete for the same CPU, memory, and IO.
OpenSearch can still be the right choice on a VPS. But you must treat the VPS as a production server, not a hobby box.
Relevance, analyzers, and “why doesn’t it find it?”
MySQL relevance: coarse knobs
MySQL FULLTEXT scoring exists, but it’s not designed for the kind of iterative relevance tuning product teams tend to do. You can do:
- Natural language queries and boolean mode.
- Weighting across columns by splitting and combining scores (awkward, but possible; a sketch follows this list).
- Stopword lists and minimum word length behavior (varies by engine/version).
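A minimal sketch of cross-column weighting, assuming hypothetical separate FULLTEXT indexes on title and on body (MATCH() must reference columns exactly as a FULLTEXT index defines them):

-- Title matches count three times as much as body matches. Awkward, but it works.
SELECT id, title,
       (3 * MATCH(title) AGAINST('invoice acme')
          + MATCH(body)  AGAINST('invoice acme')) AS score
  FROM posts
 WHERE MATCH(title) AGAINST('invoice acme')
    OR MATCH(body)  AGAINST('invoice acme')
 ORDER BY score DESC
 LIMIT 20;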
The biggest trap: expecting MySQL FULLTEXT to behave like modern web search. It won’t, and you’ll waste time trying to make it pretend.
OpenSearch relevance: more power, more responsibility
OpenSearch gives you analyzers (tokenization, lowercasing, stemming), per-field mappings, multi-fields (keyword + text), and query DSL flexibility. You can make search feel “smart.” You can also break it so it finds nothing because the analyzer doesn’t tokenize the way you assumed.
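A minimal mapping sketch of the multi-field idea: body is analyzed text for matching, while title also keeps a keyword subfield for exact filtering and sorting. The index name myindex_v1 is illustrative:

curl -X PUT 'localhost:9200/myindex_v1' -H 'Content-Type: application/json' -d '{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 },
  "mappings": {
    "properties": {
      "title": { "type": "text", "fields": { "raw": { "type": "keyword" } } },
      "body":  { "type": "text" }
    }
  }
}'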
Operational implication: mapping mistakes are outages
A mapping mistake in OpenSearch is often “reindex the world.” On a VPS, reindexing can be the thing that knocks over the node. Plan for it:
- Versioned indices and aliases (sketched after this list).
- Controlled reindex jobs with throttling.
- Snapshots before big changes.
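Roughly what that looks like, with hypothetical index names (myindex_v1, myindex_v2) and an alias the application reads from; requests_per_second throttles the reindex so it doesn’t monopolize the disk:

# Build the new index, then copy data into it at a controlled rate.
curl -X POST 'localhost:9200/_reindex?requests_per_second=500&wait_for_completion=false' \
  -H 'Content-Type: application/json' -d '{
  "source": { "index": "myindex_v1" },
  "dest":   { "index": "myindex_v2" }
}'

# Once the new index is validated, swap the alias atomically. Readers never see a gap.
curl -X POST 'localhost:9200/_aliases' -H 'Content-Type: application/json' -d '{
  "actions": [
    { "remove": { "index": "myindex_v1", "alias": "myindex" } },
    { "add":    { "index": "myindex_v2", "alias": "myindex" } }
  ]
}'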
Storage and IO: the silent killer on VPS search
On paper, both MySQL and OpenSearch are “disk-based systems.” In practice:
- MySQL tends to be sensitive to random IO patterns (buffer pool hit rate matters), log fsync behavior, and contention from other workloads.
- OpenSearch is sensitive to sustained write throughput during indexing and merges, and to read latency during query load—plus it uses the OS page cache heavily.
On a VPS, the disk is frequently the least honest component. “NVMe” in the plan description can still mean shared IO, burst credits, noisy neighbors, and throttling. Segment merges don’t care about your marketing copy.
Second joke: The cloud says your disk is “up to 3,000 IOPS.” My experience says it’s “up to 3,000 IOPS, briefly, if the moon approves.”
Three corporate mini-stories from the trenches
Incident caused by a wrong assumption: “search is read-only, so it won’t hurt prod”
A mid-sized SaaS team added OpenSearch for customer-facing search. They were careful about query performance in staging, and they were proud of it. The wrong assumption: search traffic is “read-only,” therefore safe to colocate on the same VPS-class instance as the application database during early launch.
The first week was fine. Then a customer imported a large dataset and the team enabled near-real-time indexing: refreshes stayed frequent, indexing ran continuously, and merges started stacking up. Latency rose, but not in the search endpoint first. It showed up in everything else: API timeouts, login slowness, background jobs lagging.
The culprit wasn’t CPU. It was disk contention. The OpenSearch node was doing sustained writes plus merges; the MySQL instance (same host) was trying to fsync its redo log. Both were “correct.” Together they were catastrophic.
The fix was painfully simple: separate the workloads and add explicit disk monitoring. They also increased refresh intervals during bulk indexing and added a backpressure mechanism in the ingest pipeline.
The lesson: “read-only” systems still write, sometimes aggressively. Especially search.
An optimization that backfired: shard count as a performance placebo
An engineering group migrated from MySQL FULLTEXT to OpenSearch because they wanted better relevance and facets. Early benchmarks were mixed. Someone suggested increasing the shard count to “parallelize” queries on a single node. It sounded plausible. It was wrong for their case.
They took a modest index and split it into many shards. Query throughput did not improve. Instead, heap pressure went up: more shard-level data structures, more segment metadata, more caches competing. GC pauses became noticeable under traffic spikes. Indexing throughput dropped because merges were happening across more small shards rather than fewer efficient ones.
They then tried to “fix” it with larger heap. That reduced GC frequency but increased pause times when GC did happen, and starved the OS page cache. Queries got slower again because disk reads increased.
They recovered by reindexing into fewer shards (and using an alias to switch over), keeping heap moderate, and leaving memory for the page cache. The performance improvement wasn’t from a magic number; it was from stopping the system from fighting itself.
The lesson: shard count is not a performance knob. It’s a data distribution decision with operational consequences.
A boring but correct practice that saved the day: snapshots and restore drills
A company ran OpenSearch as a self-hosted single-node service for internal log search and a small customer search feature. Not glamorous. They had one disciplined habit: weekly snapshot restore drills to a scratch instance. No one loved doing it. It was a checklist item that always felt lower priority than “real work.”
One day, after an OS patch and a reboot, the node came up with a filesystem issue that corrupted part of the index store. OpenSearch refused to start cleanly. The team did not debate theory for hours. They declared the node unhealthy, provisioned a new instance, installed the same OpenSearch version, and restored the latest snapshot.
They still had an outage window, but it was measured in a predictable chunk of time rather than in open-ended panic. Search quality was intact because their restore drill validated mappings, templates, and the end-to-end recovery process.
The lesson: snapshots without restore practice are wishful thinking. Boring practice is what turns “data loss” into “annoying afternoon.”
Fast diagnosis playbook
When “search is slow,” you need to answer one question fast: is this CPU, memory, disk, or query design? Here’s a sequence that works in real life.
First: is the host dying?
- Check load, CPU steal (VPS!), memory pressure, swap, and disk saturation.
- If the host is unhealthy, don’t start tuning queries. Fix the platform bottleneck.
Second: is the service stable?
- MySQL: connections, threads, slow queries, InnoDB status, buffer pool behavior.
- OpenSearch: cluster health (even single node), JVM heap pressure, GC pauses, thread pools, rejected requests.
Third: is it an indexing/refresh/merge issue?
- OpenSearch: check indexing rate, refresh interval, merges, and translog.
- MySQL: check write amplification, fsync, and buffer pool hit ratio.
Fourth: is it the query?
- MySQL: EXPLAIN, indexes, full scans, temporary tables, filesorts.
- OpenSearch: profile queries, analyze tokenization, inspect mappings, and validate the DSL.
Fifth: decide containment
- Reduce blast radius: throttle indexing, increase refresh interval, disable expensive aggregations, cap query complexity, or temporarily degrade features.
- Then fix root cause.
Practical tasks: commands, outputs, decisions (12+)
These are real commands you can run on a typical Linux VPS with MySQL and/or OpenSearch. Each includes what the output means and what decision you make next.
1) Check CPU steal and load (VPS reality)
cr0x@server:~$ mpstat -P ALL 1 3
Linux 6.5.0 (server) 12/30/2025 _x86_64_ (2 CPU)
12:01:00 PM CPU %usr %nice %sys %iowait %irq %soft %steal %idle
12:01:01 PM all 22.00 0.00 8.00 3.00 0.00 1.00 18.00 48.00
Meaning: %steal at 18% is your hypervisor taking CPU away. That’s not “your app is slow,” that’s “your VPS is oversold.”
Decision: If %steal is consistently >5–10% during incidents, upgrade/move hosts before doing micro-optimizations.
2) Check memory pressure and swap usage
cr0x@server:~$ free -h
total used free shared buff/cache available
Mem: 3.8Gi 3.3Gi 120Mi 42Mi 420Mi 180Mi
Swap: 2.0Gi 1.6Gi 400Mi
Meaning: You’re swapping heavily. OpenSearch + swap is a slow tragedy; MySQL + swap is also bad, just quieter.
Decision: Reduce OpenSearch heap, reduce workload, add RAM, or move search off-box. If swap is non-zero during steady state, you have an ops problem.
3) Check disk latency and saturation
cr0x@server:~$ iostat -xz 1 3
Linux 6.5.0 (server) 12/30/2025 _x86_64_ (2 CPU)
Device r/s w/s rMB/s wMB/s await %util
nvme0n1 55.0 220.0 2.1 18.4 42.30 97.00
Meaning: 97% util and 42 ms await means the disk is the bottleneck. Search nodes hate this; MySQL hates it too.
Decision: Throttle indexing/merges (OpenSearch), lower write rate (both), move to better storage, or split services.
4) Check filesystem space and inode pressure
cr0x@server:~$ df -h /var/lib
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 80G 74G 2.1G 98% /
Meaning: 98% full is incident territory. OpenSearch needs headroom for merges; MySQL needs space for logs and temp tables.
Decision: Free space now. If OpenSearch, ensure ample free space before heavy indexing or reindex operations.
5) Check kernel-level OOM events
cr0x@server:~$ journalctl -k -S -2h | tail -n 8
Dec 30 10:55:12 server kernel: Out of memory: Killed process 1920 (java) total-vm:6123456kB, anon-rss:2567890kB
Dec 30 10:55:12 server kernel: oom_reaper: reaped process 1920 (java), now anon-rss:0kB, file-rss:0kB
Meaning: The kernel killed OpenSearch (Java). That’s not an application bug; that’s capacity/limits.
Decision: Reduce heap, disable memory-hungry features, move to larger node, or stop self-hosting search on that VPS.
6) Verify MySQL is using indexes (not wishful thinking)
cr0x@server:~$ mysql -e "EXPLAIN SELECT id,title FROM posts WHERE title LIKE '%invoice%' ORDER BY created_at DESC LIMIT 20\G"
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: posts
partitions: NULL
type: ALL
possible_keys: idx_title,idx_created_at
key: NULL
key_len: NULL
ref: NULL
rows: 184233
filtered: 10.00
Extra: Using where; Using filesort
Meaning: type: ALL and key: NULL means full table scan. LIKE with a leading wildcard won’t use a normal index.
Decision: Either redesign the query (prefix search), add FULLTEXT, or accept you need a search engine.
7) Check MySQL slow query log status (and whether it’s lying)
cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'slow_query_log%'; SHOW VARIABLES LIKE 'long_query_time';"
+---------------------+-------+
| Variable_name | Value |
+---------------------+-------+
| slow_query_log | ON |
| slow_query_log_file | /var/log/mysql/mysql-slow.log |
+---------------------+-------+
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| long_query_time | 1.000 |
+-----------------+-------+
Meaning: Slow logging is on with 1s threshold. Good: you can observe reality.
Decision: If slow log is off in production, turn it on (carefully) and start measuring before rewriting systems.
8) Inspect InnoDB pressure and buffer pool behavior
cr0x@server:~$ mysql -e "SHOW ENGINE INNODB STATUS\G" | sed -n '1,40p'
=====================================
2025-12-30 12:05:01 INNODB MONITOR OUTPUT
=====================================
BUFFER POOL AND MEMORY
Total large memory allocated 2147483648
Buffer pool size 131072
Free buffers 8
Database pages 130000
Modified db pages 5200
Meaning: Almost no free buffers, many modified pages. Under write load, flushing may dominate latency.
Decision: If disk is saturated, reduce write rate, tune flushing, or move the search workload out of MySQL.
9) Check OpenSearch cluster health (single node still needs green-ish)
cr0x@server:~$ curl -s localhost:9200/_cluster/health?pretty
{
"cluster_name" : "vps-search",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 12,
"active_shards" : 12,
"unassigned_shards" : 12
}
Meaning: Yellow on single-node usually means replicas are unassigned (because there’s nowhere to put them).
Decision: On a single node, set replicas to 0 for indices where you accept no redundancy, and compensate with snapshots.
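A minimal sketch of that decision, applied to existing indices and (via a template) to future ones; the template name and catch-all pattern are illustrative:

# Existing indices: drop replicas you can never allocate anyway.
curl -X PUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 0}}'

# Future indices: default to zero replicas on this single node.
curl -X PUT 'localhost:9200/_index_template/single-node-defaults' \
  -H 'Content-Type: application/json' \
  -d '{"index_patterns": ["*"], "template": {"settings": {"number_of_replicas": 0}}}'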
10) Check OpenSearch JVM heap pressure
cr0x@server:~$ curl -s localhost:9200/_nodes/stats/jvm?pretty | sed -n '1,40p'
{
"nodes" : {
"Q1" : {
"jvm" : {
"mem" : {
"heap_used_in_bytes" : 1638000000,
"heap_max_in_bytes" : 2147483648
}
}
}
}
}
Meaning: ~76% heap used. That’s fine until it’s 90% under load and GC starts camping on your CPU.
Decision: If heap usage is persistently high, reduce shard count, reduce fielddata usage, fix mappings (keyword vs text), or add memory.
11) Check for rejected requests (thread pool saturation)
cr0x@server:~$ curl -s localhost:9200/_nodes/stats/thread_pool?pretty | grep -E '"rejected"|"search"|"write"' -n | head
42: "search" : {
58: "rejected" : 120
210: "write" : {
226: "rejected" : 45
Meaning: Rejections indicate overload. OpenSearch is telling you it can’t keep up.
Decision: Add capacity, reduce query/indexing concurrency, or implement backpressure in the app. Do not “just retry faster.”
12) Inspect OpenSearch merges (disk churn indicator)
cr0x@server:~$ curl -s localhost:9200/_nodes/stats/indices/merges?pretty | sed -n '1,80p'
{
"nodes" : {
"Q1" : {
"indices" : {
"merges" : {
"current" : 7,
"current_docs" : 180000,
"current_size_in_bytes" : 2140000000,
"total_throttled_time_in_millis" : 850000
}
}
}
}
}
Meaning: Many current merges and high throttled time suggests disk is limiting indexing and maybe queries too.
Decision: Increase refresh interval during bulk loads, throttle ingestion, or move to faster disk. Consider fewer shards.
13) Validate analysis/tokenization (why results don’t match)
cr0x@server:~$ curl -s -H 'Content-Type: application/json' localhost:9200/myindex/_analyze -d '{"text":"ACME-Invoice_2021","analyzer":"standard"}'
{"tokens":[{"token":"acme","start_offset":0,"end_offset":4,"type":"<ALPHANUM>","position":0},{"token":"invoice_2021","start_offset":5,"end_offset":17,"type":"<ALPHANUM>","position":1}]}
Meaning: The analyzer kept invoice_2021 as one token. If you expected “invoice” and “2021” separately, your analyzer choice is wrong.
Decision: Change analyzers (or add subfields) before you tune queries. Reindex if necessary.
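One way to fix the example above, sketched with a hypothetical custom analyzer that turns underscores into spaces before the standard tokenizer runs; you would create a new versioned index with it and reindex:

curl -X PUT 'localhost:9200/myindex_v2' -H 'Content-Type: application/json' -d '{
  "settings": {
    "analysis": {
      "char_filter": {
        "underscores": { "type": "pattern_replace", "pattern": "_", "replacement": " " }
      },
      "analyzer": {
        "doc_refs": {
          "type": "custom",
          "char_filter": ["underscores"],
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": { "properties": { "body": { "type": "text", "analyzer": "doc_refs" } } }
}'

# Verify before reindexing: this should now yield "acme", "invoice", "2021".
curl -s -H 'Content-Type: application/json' 'localhost:9200/myindex_v2/_analyze' \
  -d '{"text":"ACME-Invoice_2021","analyzer":"doc_refs"}'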
14) Profile a slow OpenSearch query
cr0x@server:~$ curl -s -H 'Content-Type: application/json' localhost:9200/myindex/_search -d '{
"profile": true,
"query": { "match": { "body": "invoice acme" } },
"size": 10
}' | sed -n '1,60p'
{
"took" : 185,
"timed_out" : false,
"hits" : {
"total" : {
"value" : 12034,
"relation" : "eq"
}
}
}
Meaning: took: 185 ms isn’t horrible, but if p95 is seconds, you need to inspect the profile sections to see which part is expensive (scoring, term queries, fetch phase, etc.).
Decision: If fetch is expensive, reduce stored fields/_source size. If scoring is expensive, adjust query type or pre-filter with keywords.
15) Confirm OpenSearch is actually listening and not flapping
cr0x@server:~$ systemctl status opensearch --no-pager
● opensearch.service - OpenSearch
Loaded: loaded (/lib/systemd/system/opensearch.service; enabled)
Active: active (running) since Mon 2025-12-30 11:40:11 UTC; 25min ago
Docs: man:opensearch
Main PID: 1920 (java)
Tasks: 132 (limit: 4620)
Memory: 2.7G
CPU: 18min 12.345s
Meaning: Service is up, but memory at 2.7G on a small VPS may be a warning sign, not a victory.
Decision: Check heap settings and total RAM. If the process plus OS cache exceeds capacity, you’re heading for OOM.
16) Check MySQL connections and thread pressure
cr0x@server:~$ mysql -e "SHOW STATUS LIKE 'Threads_connected'; SHOW STATUS LIKE 'Max_used_connections';"
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Threads_connected | 182 |
+-------------------+-------+
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Max_used_connections | 240 |
+----------------------+-------+
Meaning: High thread counts can mean connection pooling issues or a stuck app. It can also indicate slow queries causing pileups.
Decision: If threads climb during incidents, fix query latency and implement sane pooling limits. Don’t just raise max_connections.
Common mistakes: symptom → root cause → fix
1) “MySQL search is slow”
Symptom: Search endpoint spikes CPU, queries take seconds, DB load jumps.
Root cause: Using LIKE '%term%' on large text columns, causing full scans; or sorting/pagination without supporting indexes.
Fix: Use prefix search where possible; add FULLTEXT; normalize searchable fields; add composite indexes for filters + sort order; consider OpenSearch if substring/typo tolerance is required.
2) “OpenSearch is randomly slow”
Symptom: p95 latency jumps; occasional multi-second queries; CPU looks okay.
Root cause: GC pauses from high heap pressure, or disk latency during segment merges and page cache misses.
Fix: Reduce shard count; fix mappings to avoid fielddata explosions; cap aggregations; ensure sufficient RAM for OS cache; move to better disk.
3) “Cluster health is yellow and I’m panicking”
Symptom: Single-node OpenSearch shows yellow status.
Root cause: Replicas configured > 0 with only one node.
Fix: Set replicas to 0 for those indices and rely on snapshots for recovery.
4) “Indexing made search unusable”
Symptom: During bulk imports, search latency spikes and timeouts occur.
Root cause: Refresh too frequent, merges dominating disk, indexing competing with search on the same node.
Fix: Increase refresh interval during bulk loads; throttle ingest; run bulk indexing off-peak; consider separate nodes for indexing and serving (even if that means “don’t do it on one VPS”).
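A minimal sketch of the refresh-interval part, assuming the index is called myindex:

# Before the bulk load: refresh less often so fewer tiny segments get created.
# (Setting it to -1 disables refresh entirely; if you do that, remember to restore it.)
curl -X PUT 'localhost:9200/myindex/_settings' -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "30s"}}'

# After the bulk load: put it back (the default is 1s).
curl -X PUT 'localhost:9200/myindex/_settings' -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "1s"}}'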
5) “OpenSearch ran fine until the disk filled”
Symptom: Writes fail, cluster goes read-only, weird errors.
Root cause: Not accounting for merge overhead, snapshots, translog growth, and retention; no planning around disk watermarks.
Fix: Keep significant free space; enforce retention; monitor disk; test snapshot sizes; avoid massive reindex jobs without headroom.
6) “Search results are missing obvious matches”
Symptom: Users insist the data exists; search doesn’t find it.
Root cause: Analyzer/mapping mismatch (keyword vs text), tokenization surprises, stopwords/min token length behavior, or stale index due to failed sync pipeline.
Fix: Verify analysis with _analyze; inspect mappings; implement robust indexing pipeline with retries and dead-letter handling; build consistency checks between MySQL and OpenSearch.
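A crude but useful drift check, assuming a posts table with a soft-delete column and an index (or alias) named myindex; both names are illustrative:

# Compare the source-of-truth row count with the index document count.
db_count=$(mysql -N -e "SELECT COUNT(*) FROM posts WHERE deleted_at IS NULL")
idx_count=$(curl -s 'localhost:9200/myindex/_count' | sed 's/.*"count":\([0-9]*\).*/\1/')
echo "mysql=$db_count opensearch=$idx_count"
# If the numbers drift and stay drifted, the pipeline is losing events. Alert on it.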
7) “We made heap bigger and it got worse”
Symptom: Fewer GCs but longer pauses; query latency worsens; disk reads increase.
Root cause: Heap too large starves OS page cache; Lucene benefits from filesystem cache.
Fix: Keep heap moderate; leave RAM for page cache; fix shard/mapping issues rather than hiding them under heap.
Checklists / step-by-step plan
Plan A: Stay with MySQL (and be honest about requirements)
- Define search features explicitly. Keyword only? Prefix? Filters? Relevance tuning? Typos? If you need typos and synonyms, stop pretending.
- Audit queries. Replace %term% patterns where possible; ensure filters and sort orders have indexes.
- Try FULLTEXT where it fits. Especially for “search in title/body” cases.
- Measure with slow query log. Keep a stable threshold (1–2s) and review top offenders weekly; enabling the log is sketched after this list.
- Capacity plan. Ensure DB has enough buffer pool and disk headroom; isolate from other noisy workloads.
- Set expectations. Document relevance limitations in product terms: “exact words, not typos,” “no synonyms,” “no stemming.”
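If the slow log isn’t on yet, a minimal sketch; these are runtime settings, so also persist them in your MySQL config so they survive a restart:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';
SET GLOBAL long_query_time = 1;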
Plan B: Use OpenSearch, but don’t make your VPS the lab rat
- Start with a single-node dev instance. Validate analyzers, mappings, and basic queries.
- Design the document model. Denormalize. Avoid joins. Decide what is “text” vs “keyword.”
- Implement indexing pipeline. Use a queue/outbox pattern so MySQL writes and index updates are decoupled and retryable (an outbox table is sketched after this list).
- Use aliases for zero-downtime reindex. Version indices and swap aliases.
- Define snapshot strategy. Schedule snapshots and test restores.
- Set guardrails. Cap query complexity, limit aggregation sizes, and add timeouts.
- Observe everything. Heap, GC, disk latency, thread pool rejections, query latency distribution.
- Only then consider self-hosting on a small VPS. If you can’t meet stability goals, you’re not “saving money,” you’re financing incidents.
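A sketch of the outbox table mentioned above; the schema and names are illustrative. The point is that the row is written in the same transaction as the source-of-truth change, and a worker drains unprocessed rows into OpenSearch with retries:

CREATE TABLE search_outbox (
  id           BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  entity_type  VARCHAR(32) NOT NULL,              -- e.g. 'post'
  entity_id    BIGINT UNSIGNED NOT NULL,
  op           ENUM('upsert','delete') NOT NULL,
  created_at   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  processed_at TIMESTAMP NULL,                    -- set once the index update succeeds
  KEY idx_unprocessed (processed_at, id)
);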
Plan C: If you insist on OpenSearch on a VPS, do these minimums
- Give it enough RAM to leave room for OS cache (don’t allocate all memory to heap).
- Use fast, consistent storage. If your VPS disk has burst credits, assume you will hit the ceiling.
- Set replicas to 0 on single node, and rely on snapshots.
- Keep shard count low. You’re not a cluster; don’t cosplay as one.
- Throttle bulk indexing and increase refresh interval during imports.
- Monitor and alert on: disk > 80%, heap > 85%, thread pool rejections > 0, query p95, and OOM events.
- Run a restore drill before you need it.
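The snapshot and restore-drill minimums, sketched with a filesystem repository; the location must be listed under path.repo in opensearch.yml, and all names here are illustrative:

# Register a repository and take a snapshot.
curl -X PUT 'localhost:9200/_snapshot/backup' -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/var/backups/opensearch"}}'
curl -X PUT 'localhost:9200/_snapshot/backup/snap-2025-12-30?wait_for_completion=true'

# Restore drill (ideally on a scratch instance): restore under a new name and query it.
curl -X POST 'localhost:9200/_snapshot/backup/snap-2025-12-30/_restore' \
  -H 'Content-Type: application/json' \
  -d '{"indices": "myindex_v2", "rename_pattern": "(.+)", "rename_replacement": "restored_$1"}'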
FAQ
1) Can MySQL FULLTEXT replace OpenSearch for a small site?
Yes, if “small” also means “simple.” Keyword search across a couple fields with basic filtering is fine. The moment you need typo tolerance, synonyms, and faceting, you’ll feel the ceiling.
2) Is OpenSearch overkill for a VPS?
Often, yes. Not because it won’t run, but because it won’t run reliably without enough RAM and disk performance. On a small VPS, you’re one merge storm away from a bad day.
3) Why does OpenSearch need so much memory?
It’s Lucene plus a JVM plus caches plus per-shard overhead. Also, Lucene depends heavily on the OS page cache for fast reads. If you starve the OS cache, search gets disk-bound quickly.
4) Should I set OpenSearch heap to 75% of RAM?
Not on a VPS where you care about performance. You need memory for the OS page cache. If heap is too big, you trade “GC problems” for “disk problems,” and neither is charming.
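For reference, heap is set via JVM options, not the OpenSearch API. A sketch for a small box, assuming a package install with its config under /etc/opensearch; the 1g figure is illustrative, not a recommendation for your workload:

# /etc/opensearch/jvm.options.d/heap.options (or edit jvm.options directly)
-Xms1g
-Xmx1g

Keep -Xms and -Xmx equal, and leave the rest of RAM for the OS page cache and everything else on the box.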
5) What’s the biggest hidden cost of adding OpenSearch?
Synchronization and correctness. MySQL is the source of truth; OpenSearch is an index. You need a pipeline, retries, backfills, and a way to detect drift.
6) Can I run MySQL and OpenSearch on the same VPS?
You can, but you probably shouldn’t for production. Disk contention is the usual failure mode, followed by memory pressure. If you do it anyway, isolate resources and monitor disk latency like your job depends on it.
7) Why is my OpenSearch cluster health yellow on single node?
Replicas can’t be allocated because there’s only one node. Set number_of_replicas to 0 for those indices, then lean on snapshots for recovery.
8) When does it make sense to stay with MySQL even if relevance isn’t perfect?
When uptime and simplicity matter more than search sophistication, and your users can tolerate “exact word” search. Many products quietly live here and do just fine.
9) What’s the cleanest migration path from MySQL search to OpenSearch?
Build the indexing pipeline first, backfill into a versioned index, validate results, then switch reads gradually (or by feature flag). Keep the ability to fall back to MySQL search temporarily.
Next steps you can actually do this week
If you’re still deciding, do this in order:
- Write down the required search features. Not “nice to have.” Required.
- Run the diagnostics. Check disk latency, swap, and CPU steal. If the VPS is unhealthy, stop and fix that first.
- Prototype in MySQL. Try FULLTEXT or structured-prefix patterns. Measure with slow logs and EXPLAIN.
- Prototype OpenSearch relevance on a dev box. Validate analyzers/mappings with _analyze. Profile your slowest queries.
- Decide based on ops reality. If you can’t commit to snapshots, monitoring, and restore drills, avoid self-hosting OpenSearch on a small VPS.
- If you choose OpenSearch anyway: keep shard count low, set replicas to 0 on single node, throttle indexing, and monitor heap + disk like it’s production—because it is.
The best self-hosted search setup is the one that survives a bad day. You don’t need a perfect relevance model if the service is down. And you don’t need a beautiful cluster if it eats your VPS the moment traffic becomes real.