What Is beava?
beava is a Rust-backed real-time feature server built by beava-dev for fraud detection, ad-tech, and behavioral analytics teams. The project advertises 684,812 sustained events/sec on a single Apple M4 core in its benchmark notes, with event ingestion over HTTP or TCP and feature reads served at sub-millisecond latency. It replaces the usual pile of Redis counters, Postgres triggers, and batch jobs with one binary and a Python-first schema layer.
The core idea is simple: push events in, maintain per-entity aggregates atomically, and query feature tables immediately. That makes beava a fit for online scoring, velocity checks, anomaly rules, and behavioral signals where stale data is expensive.
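To make the push-then-query loop concrete, here is a minimal in-memory sketch in plain Python. It is not beava's implementation or API: the `EntityFeatures` class, its method names, and the two aggregates are assumptions chosen only to show what "maintain per-entity aggregates atomically" means.

```python
import threading
from collections import defaultdict


class EntityFeatures:
    """Toy feature table keyed by entity id (illustrative, not beava's API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = defaultdict(int)  # entity -> number of events seen
        self._pages = defaultdict(set)   # entity -> distinct pages seen

    def push(self, user_id, page):
        # Update every aggregate for the entity under one lock, so a
        # reader never sees the count without the matching page set.
        with self._lock:
            self._counts[user_id] += 1
            self._pages[user_id].add(page)

    def get(self, user_id):
        # Read a consistent view of the entity's features.
        with self._lock:
            return {
                "count": self._counts.get(user_id, 0),
                "n_unique_pages": len(self._pages.get(user_id, ())),
            }


table = EntityFeatures()
table.push("alice", "/home")
table.push("alice", "/home")
table.push("alice", "/pricing")
print(table.get("alice"))  # {'count': 3, 'n_unique_pages': 2}
```

A single lock is the simplest way to get the read-after-push consistency described above; a real server would likely shard or partition state rather than serialize every entity behind one lock.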
Quick Overview
| Attribute | Details |
|---|---|
| Type | Real-Time Feature Servers |
| Best For | Fraud detection, ad-tech, and behavioral analytics teams |
| Language/Stack | Python SDK, Rust server, HTTP, framed TCP, JSON, msgpack |
| License | Apache-2.0 |
| GitHub Stars | N/A as of Feb 2026 |
| Pricing | Open-Source |
| Last Release | N/A |
beava is a stateful event-processing server, not a generic cache. The project page describes a single-binary system with a Python decorator API, a TCP fast path, WAL-backed durability, and snapshot recovery, which makes it closer to a specialized online feature store than to a plain key-value database.
Who Should Use beava?
- Fraud engineers building velocity rules, per-user counters, and device reputation checks that need up-to-date features on every request.
- Ad-tech teams running click, impression, and conversion scoring where per-entity aggregation must stay in sync with streaming events.
- Platform engineers who want a single server process instead of stitching together Redis, Kafka consumers, and batch ETL.
- Indie hackers and CTOs who need a local-first dev loop, a small operational surface, and a Python API that stays readable under load.
Not ideal for:
- Teams that need built-in multi-tenant auth, TLS termination, or hosted SaaS controls in the same binary.
- Workloads that require long-range analytics over billions of rows, ad hoc SQL, or warehouse-style joins.
- Organizations that already rely on an established feature store and only need a thin online cache layer.
Key Features of beava
- Atomic per-entity feature updates: beava computes counters, velocities, distances, rates, and distributions on every event, then writes them atomically for each entity key. That matters for fraud pipelines because a read after a push sees a consistent state, not a partially updated counter set.
- Dual transport surface: the server exposes HTTP/JSON on port 8080 for debugging and framed TCP on port 8081 for the fast path. TCP uses a compact big-endian frame with strict FIFO ordering per connection, which keeps request/response correlation simple without a `request_id` field.
- Python table decorators: the SDK lets you declare event types and feature tables in Python with decorators such as `@bv.event` and `@bv.table`. That makes the schema readable, keeps business logic close to the model, and avoids hand-written glue between the application and the server.
- Durability with WAL and snapshots: beava persists writes to a write-ahead log and takes periodic snapshots, so a restart can recover state in seconds. The docs also call out refusal of network filesystems for storage, which is the right trade-off when fsync behavior has to stay predictable.
- Embed mode and external server mode: the client can connect to `http://localhost:8080`, or it can spawn an embedded server with `bv.App()` for local development. That reduces the gap between a quick prototype and a production deployment.
- Purpose-built aggregation primitives: the project claims 50+ aggregation primitives, including `count`, `n_unique`, and other feature ops that map directly to streaming fraud rules. That is a better fit than stuffing custom state logic into Redis Lua scripts.
- Benchmark-oriented design: the repository includes a dedicated benchmark harness and documents a 60-second sustained throughput run using TCP and msgpack. The stated single-core result is useful because it gives you a real ceiling for capacity planning rather than a synthetic microbenchmark.
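The durability bullet above follows a common pattern: log each write before applying it, snapshot periodically, and on restart load the snapshot and replay the log. The sketch below illustrates that general pattern with the Python standard library. The `DurableCounter` class, file names, and JSON log format are assumptions for illustration and say nothing about beava's on-disk layout.

```python
import json
import os
import tempfile


class DurableCounter:
    """Toy counter store with an append-only log and snapshots."""

    def __init__(self, data_dir):
        self.wal_path = os.path.join(data_dir, "wal.log")
        self.snap_path = os.path.join(data_dir, "snapshot.json")
        self.counts = {}
        self._recover()
        self._wal = open(self.wal_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the latest snapshot, then replay log entries written after it.
        if os.path.exists(self.snap_path):
            with open(self.snap_path, encoding="utf-8") as f:
                self.counts = json.load(f)
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as f:
                for line in f:
                    key = json.loads(line)["key"]
                    self.counts[key] = self.counts.get(key, 0) + 1

    def push(self, key):
        # Log first and force it to disk, then apply the change in memory.
        self._wal.write(json.dumps({"key": key}) + "\n")
        self._wal.flush()
        os.fsync(self._wal.fileno())
        self.counts[key] = self.counts.get(key, 0) + 1

    def snapshot(self):
        # Write the snapshot atomically, then truncate the log it covers.
        tmp = self.snap_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.counts, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snap_path)
        self._wal.close()
        self._wal = open(self.wal_path, "w", encoding="utf-8")

    def close(self):
        self._wal.close()


data_dir = tempfile.mkdtemp()
store = DurableCounter(data_dir)
store.push("alice")
store.push("alice")
store.snapshot()
store.push("bob")
store.close()

recovered = DurableCounter(data_dir)  # simulate a restart
print(recovered.counts)  # {'alice': 2, 'bob': 1}
recovered.close()
```

This toy version would double-count if the process died between writing the snapshot and truncating the log; production systems typically tag log entries with sequence numbers to avoid that. The per-write `os.fsync` call is also why predictable local-disk behavior matters for this kind of design.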
beava vs Alternatives
| Tool | Best For | Key Differentiator | Pricing |
|---|---|---|---|
| beava | Online fraud and behavioral features | Single-binary feature server with Python decorators and TCP fast path | Open-Source |
| Redis | General-purpose caching and simple counters | Broad ecosystem and battle-tested primitives, but custom feature logic needs Lua or app code | Open-Source / Paid Cloud |
| Feast | Feature store orchestration | Strong offline/online feature-store workflow and provider integrations | Open-Source |
| Materialize | Streaming SQL pipelines | SQL-native incremental views for event streams | Paid / Open-Source |
Pick Redis when you already have cache-heavy infrastructure and only need a few counters or sets. Pick Feast when the bigger problem is feature management across offline and online stores, not just the online serving path.
Pick Materialize when your team wants SQL over streams and can model the problem relationally. Pick beava when you want direct event ingestion, per-entity state, and sub-millisecond reads without building a custom stream processor.
If you need observability around the event pipeline, pair beava with OpenTrace so you can inspect request flow and latency before features are queried. If the same product also needs downstream analytics storage or reporting, DataHaven is a sensible companion for the colder path.
How beava Works
beava uses a stateful event model where you register event schemas and derived tables, then push events into the server as they happen. Each incoming event updates one or more per-entity feature rows, and those rows are addressed by a key such as user_id, which keeps read access simple and predictable.
The architecture is intentionally direct: a Python layer defines the schema, the Rust server executes the state transitions, and the transport layer accepts either JSON or msgpack over HTTP or framed TCP. The documentation spells out strict FIFO per connection, big-endian framing, and a compact opcode table, which is exactly the kind of low-level detail you want when every microsecond matters.
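As a rough illustration of big-endian framing with FIFO correlation, the sketch below encodes and decodes length-prefixed frames using Python's `struct` module. The header layout (a 4-byte length plus a 1-byte opcode), the opcode values, and the JSON body are all assumptions made for this example; beava's actual wire format and opcode table are defined in its own documentation.

```python
import json
import struct
from collections import deque

# Hypothetical opcodes and header layout, for illustration only.
OP_PUSH, OP_GET = 0x01, 0x02
HEADER = struct.Struct(">IB")  # big-endian u32 body length + u8 opcode


def encode_frame(opcode, payload):
    """Serialize one message as header + JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return HEADER.pack(len(body), opcode) + body


def decode_frames(buffer):
    """Parse complete frames; return (frames, leftover_bytes)."""
    frames, offset = [], 0
    while len(buffer) - offset >= HEADER.size:
        length, opcode = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # body not fully received yet
        frames.append((opcode, json.loads(buffer[offset + HEADER.size:end])))
        offset = end
    return frames, buffer[offset:]


# Client: queue each request in send order.
requests = [
    (OP_PUSH, {"event": "Click", "data": {"user_id": "alice", "page": "/home"}}),
    (OP_GET, {"table": "UserActivity", "key": "alice"}),
]
pending = deque(requests)
wire = b"".join(encode_frame(op, body) for op, body in requests)

# Stand-in server: answer frames in the order they arrived.
received, leftover = decode_frames(wire)
replies = b"".join(encode_frame(op, {"ok": True}) for op, _ in received)

# Client: the n-th reply belongs to the n-th pending request (FIFO).
paired = [(pending.popleft(), reply) for reply in decode_frames(replies)[0]]
for (req_op, req_body), (_, reply_body) in paired:
    print(req_op, reply_body)
```

Because replies arrive in request order on a given connection, the client only needs a queue to match them up, which is why no per-request identifier is required in the frame.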
```sh
curl -fsSL https://raw.githubusercontent.com/beava-dev/beava/main/scripts/install.sh | sh
beava --data-dir ./.beava/
curl -X POST localhost:8080/register -d '{...schema...}'
curl -X POST localhost:8080/push -d '{"event":"Click","data":{"user_id":"alice","page":"/home"}}'
curl -X POST localhost:8080/get -d '{"table":"UserActivity","key":"alice"}'
```
That flow installs the binary, starts the server with durable storage, registers the schema, pushes an event, and reads back the derived features. In practice, you should expect to define your event and table classes in Python first, then use either the HTTP endpoint for inspection or the TCP listener for production traffic.
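The same push and get calls can be issued from Python with only the standard library. This is a sketch, not the beava SDK: the `/push` and `/get` paths and request bodies are taken from the curl commands above, while the `Content-Type` header, the helper names, and the assumption that responses are JSON are mine.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # HTTP port used in the curl commands


def build_request(path, payload):
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption
        method="POST",
    )


def call(path, payload, timeout=2.0):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


push_req = build_request(
    "/push", {"event": "Click", "data": {"user_id": "alice", "page": "/home"}}
)
get_req = build_request("/get", {"table": "UserActivity", "key": "alice"})
print(push_req.get_method(), push_req.full_url)  # POST http://localhost:8080/push
```

With a server running locally, `call("/get", {"table": "UserActivity", "key": "alice"})` would send the read; the request objects are built separately here so the example runs without a live server.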
Pros and Cons of beava
Pros:
- Very low read latency for online features, with the project positioning reads at sub-millisecond speed.
- Single-binary deployment keeps the operational footprint smaller than a broker-plus-consumer-plus-store stack.
- Python authoring model makes schema and feature definitions easy to review in code review.
- Durable state via WAL and snapshots reduces the risk of losing fraud-critical counters on restart.
- Binary and JSON wire support gives you a debug-friendly path and a higher-throughput path.
- Benchmark transparency is better than average because the repo documents the benchmark command and transport settings.
Cons:
- No built-in TLS in v0, so you need a proxy like nginx, Envoy, or Cloudflare at the edge.
- No built-in auth in v0, which means private-network deployment is a hard requirement.
- In-memory-first sizing can get expensive for very large entity counts, especially when you keep rich feature packs in RAM.
- Narrower scope than a full feature store, so it will not replace warehouse orchestration or offline training pipelines.
- Early-stage ergonomics still show through in the CLI and config surface, which means operators may need to read the docs closely.
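For the in-memory sizing concern, a back-of-envelope estimate is usually enough to tell whether a workload fits on one machine. The numbers below are hypothetical, and the per-feature byte size and overhead factor are assumptions to replace with measurements from your own deployment.

```python
def estimate_ram_gib(entities, features_per_entity, bytes_per_feature, overhead=1.5):
    """Rough RAM estimate in GiB; overhead covers keys, indexes, allocator slack."""
    raw_bytes = entities * features_per_entity * bytes_per_feature
    return raw_bytes * overhead / 2**30


# Hypothetical workload: 50M entities, 40 features each, 16 bytes per feature.
estimate = estimate_ram_gib(50_000_000, 40, 16)
print(f"{estimate:.1f} GiB")  # 44.7 GiB
```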
Getting Started with beava
The fastest path is the shell install script or the Docker image, then a local server start with a data directory. That gives you the same beava binary either way, which is useful when you want to test the exact runtime you will deploy.
```sh
# install the latest release binary
curl -fsSL https://raw.githubusercontent.com/beava-dev/beava/main/scripts/install.sh | sh
# or run the container
# docker run -p 8080:8080 -p 8081:8081 beavadev/beava:edge
# start with persistent storage
beava --data-dir ./.beava/
# optional in-process demo
beava quickstart
```
After startup, the server listens on 127.0.0.1:8080 for HTTP and 127.0.0.1:8081 for TCP unless you override the addresses. If you are prototyping, beava quickstart is the fastest way to see event registration, event push, and feature retrieval without wiring a full app.
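A quick way to confirm both listeners are up is to attempt a TCP connection to each default address. The helper below uses only the standard library; `port_is_open` is a name chosen for this sketch, and it checks reachability only, not that the process answering is beava.

```python
import socket


def port_is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default addresses named above: 8080 for HTTP, 8081 for framed TCP.
for name, port in [("HTTP", 8080), ("TCP", 8081)]:
    state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
    print(f"{name} 127.0.0.1:{port}: {state}")
```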
Verdict
beava is a strong option for real-time fraud and behavior feature serving when you need atomic per-entity updates, Python-defined schemas, and a single process that can ingest and serve state quickly. Its biggest strength is the tight event-to-feature loop; its main caveat is the missing TLS and auth layer in v0. Choose beava if you want online features without a separate broker-and-store stack.



