Splinter ⚡ A Vector Anti-Database & Shared-Memory Substrate

Splinter is a minimalist, lock-free key-value manifold designed to facilitate high-frequency data ingestion and retrieval across disjointed runtimes. It is built on the belief that for local inter-process communication (IPC), the kernel’s networking stack is an expensive and unnecessary coupling.

Splinter emerged out of frustration resulting from attempting to stretch tools over gaps that they simply were never designed to cover. It wasn't a question of more tuning; it was a need to cut out the socket layer and kernel arbitration completely.

The choice was either to completely dismantle and re-imagine SQLite or to write something entirely different. Given how few options existed, different seemed most beneficial to both the immediate need and the current ecosystem.

Design Philosophy: Low Complexity + Systemic Sympathy = Speed!

Modern software has become complacent with IaaS marketing, assuming that CPU cycles and memory bandwidth are infinite. As standard practice, we ask the kernel's socket layer to move a value that is already in memory to another region of the same physical memory.

And now we're doing that with 768-dimensional vectors 😱. Splinter is a gesture back in the direction of efficiency for systems development. Here are the core tenets that set it apart (aka: why Splinter is so damn fast):

The "Good Process Neighbor" Approach

(Technically, only the CLI or client code is a process; Splinter itself is just a place, not a process.)

Splinter assumes informed intent. It does not try to outsmart the kernel with O_DIRECT or complex paging logic. It provides the metadata (ctime, atime, epoch) and the memory region, then gets out of your way. It is a tool for engineers who would rather spend their thermal budget on the math, not the management.

Relational databases attempt to shield themselves from the kernel by trying to be the kernel. Splinter goes out of its way to not bother the kernel unless it must, and its logic shards inform the kernel of how the memory is intended for use at every step of the way. Here's more about why Splinter and Linux are great friends.
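To make "informing the kernel" concrete, here is a minimal sketch of the general technique using the standard madvise() call. The helper name map_with_hint is hypothetical and is not part of Splinter's API; which hints Splinter actually issues, and when, is not shown here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper (not Splinter's API): map a shared anonymous
 * region and tell the kernel how we intend to touch it. */
static void *map_with_hint(size_t len, int advice)
{
    assert(len > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* The hint is advisory only, so a failure here is not fatal. */
    (void)madvise(p, len, advice);
    return p;
}
```

A caller that expects to stream through the region front to back might pass MADV_SEQUENTIAL; one that jumps around might pass MADV_RANDOM. Either way the kernel keeps doing the paging, and the application only states its intent.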

Supported Platforms & Linkage

Splinter is designed to work on any modern GNU/Linux flavor. Windows users can use WSL with a slight penalty. macOS requires some questionable shimming around the lack of memfd (forcing anonymous file descriptors to work), but it should otherwise function perfectly.

Optional Linkage (Enable during build):

Quick Start / Building & Installing

If you want to build Splinter with "everything", clone the repo, enter libsplinter/ and type make. This configures a build with -DWITH_NUMA=ON, -DWITH_LUA=ON, -DWITH_EMBEDDINGS=ON, -DWITH_LLAMA=ON (for inference) and -DWITH_RUST=ON (for bindings).

This also means you need to have all of those prerequisites installed and want to build with them. If you want to pick and choose yourself, or enable nothing at all, then do this:

git clone git@bitbucket.org:tinkertim/libsplinter.git

cd libsplinter
mkdir build
cd build

cmake -D{your flags} ..
make
ctest --output-on-failure
sudo -E make install

The -E option tells sudo to preserve your environment, which is required if you're running install targets with Rust enabled. Once installed, you can verify with splinterctl --version.

From there, check out the CLI and then the C API and bindings. You can find examples of most functions in splinter_test.c if the doxygen-style comments aren't enough.

Comparison With Related Tools

It's not entirely fair to compare active databases to Splinter because they are competing with their "hands tied" by mutexes and socket layers. Splinter deliberately eschews any calculation on-write that it can't do with simple bitwise math; Splinter tries to stay boring.

| Feature | Splinter | Traditional Vector DBs |
| --- | --- | --- |
| Transport | memfd() that degrades gracefully to mmap() (L3 Speed) | TCP/gRPC (Network Stack) |
| Daemon | None (Passive) | Active Service (Heavy) |
| Footprint | Static & Deterministic | Dynamic & Volatile |
| Complexity | ~875 Lines of obsessively-optimized C (Will never exceed 999) | 100k+ Lines of Code |
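For readers unfamiliar with the transport row, the sketch below shows the general pattern of getting a shareable file descriptor from memfd_create() and falling back to an unlinked temporary file when it is unavailable, then mapping it with mmap(). The helper names open_region_fd and map_region, the /tmp path, and the specific fallback are assumptions made for illustration; Splinter's own degradation path may differ.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return an fd backing `len` bytes of shareable memory. */
static int open_region_fd(const char *name, size_t len)
{
    int fd = -1;
    assert(len > 0);

#ifdef MFD_CLOEXEC
    /* Preferred path: an anonymous, memory-backed file (Linux). */
    fd = memfd_create(name, MFD_CLOEXEC);
#else
    (void)name;
#endif
    if (fd < 0) {
        /* Assumed fallback: a temp file unlinked right away, so only
         * the descriptor keeps it alive. */
        char path[] = "/tmp/region-XXXXXX";
        fd = mkstemp(path);
        if (fd < 0)
            return -1;
        unlink(path);
    }
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the region; every MAP_SHARED mapping of the fd sees the same bytes. */
static void *map_region(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Two mappings of the same descriptor (in one process or in several) observe each other's writes directly, which is the property the table contrasts with a TCP/gRPC round trip.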

Think about what you can do once TypeScript, Rust, Python and Go can all share the same address space and embeddings safely, without socket or even memcpy() overhead :).

The Main Splinter Use Cases (What's it Good For?)

Splinter can be anything from a simple configuration store to a Rank-2 tensor model scaffold. It's designed for vector-heavy workflows like Artificial Intelligence (AI) inference or high-resolution physics and linguistic research.

1. High-Res Physics Of All Kinds

Do you like vectors? Of course you do.

Splinter was built around the idea of capturing raw data exceptionally well while making backfill easy. It allows up to 64 signal groups per bus, ctags-style labeling, and built-in per-slot Bloom filters. Slot coupling allows for simple standard ordered sets (e.g. foo_key.1, foo_key.2 for velocity and acceleration). You can record high-frequency data at L3 speeds without hardware aliasing, and have many keys in tandem with vectors all representing a single fraction of a second, if you have the room.

To set expectations: Splinter was built primarily around GDELT consumption. High-rank tensors? No problem. They are routine in the author's use, and Splinter makes sharing them fast and safe.

2. The Semantic Hippocampus (LLM Orchestrated Memory)

Splinter functions remarkably well as semantic short-to-long-term memory for Large Language Models. LRU-based movement helps "forget" ephemera quickly while making sure stuff that actually matters (as viewed by access time and epoch) settles into long-term memory. You can run inference directly on the bus, accessing embeddings using Splinter's supervised raw pointers so operations require zero memcpy().
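To show what "zero memcpy()" means in practice, here is a small sketch of a similarity primitive that reads two embeddings through pointers and never copies them. The function name dot_in_place is hypothetical; in real use the pointers would come from Splinter's own accessors into the mapped region, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dot product over two vectors read in place.
 * Nothing is copied; both pointers may point straight into a shared
 * mapping. */
static float dot_in_place(const float *a, const float *b, size_t dims)
{
    float acc = 0.0f;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < dims; i++)
        acc += a[i] * b[i];
    return acc;
}
```

For the 768-dimensional vectors mentioned earlier, each operand is 3 KiB of floats, so skipping a copy per comparison adds up quickly when scanning many slots.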

Plus, well, "psychic search" being fundamental to design is quite attractive for such a feature. Inference (on the bus) via .gguf is included.

3. Configurations, Registries & Edge Caching

Splinter's epochs and feature flags lend themselves very well to application configuration on Linux systems. You can also compile Splinter to simply ignore embeddings (WITH_EMBEDDINGS=0) and use it as a local, socket-less cache server. (The author uses Splinter to trickle into Redis based on key activity.)

4. Embedded IoT Use

Splinter is great for environmental loggers or system ring buffers because its static geometry is far better suited to flash-based storage than a relational database is. At just ~875 lines of code, it stays in the "hot path" for most modern edge processors.

Exhaustive Feature Overview

1. Performance & Scale

2. Vector & Math Native

3. Mechanical Hygiene (Auto-Scrubbing)

Splinter offers three levels of sanitation to balance data integrity with computational thermodynamics:

4. The Signal Arena

5. Sidecars & Loadable Shards

Contact The Author

I'm Tim Post (former Stack Overflow Employee & Community Leader). You can reach me at timthepost@protonmail.com if you have questions.