That sounds like a practical take on LLM memory — especially the block-level deduplication part.
Most “memory” layers I’ve seen for AI are either overly complex or end up ballooning storage costs over time, so a content-addressed approach makes a lot of sense.
Also curious: have you benchmarked retrieval speed against more traditional vector DB setups? That could be a big selling point for devs running local research workflows.
elpocko 2 hours ago [-]
>block-level deduplication (saves 30-40% on typical codebases)
How is savings of 40% on a typical codebase possible with block-level deduplication? What kind of blocks are you talking about? Blocks as in the filesystem?
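For anyone wondering what block-level deduplication over a set of files could look like, here is a minimal sketch. It assumes fixed-size blocks keyed by SHA-256; the thread does not say how the tool actually chunks data, and `split_blocks`, `dedup_stats`, and the 4 KiB block size are hypothetical choices for illustration only.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, not taken from the project


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def dedup_stats(files: list[bytes], block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (raw_bytes, stored_bytes) when each distinct block is kept once."""
    store: dict[str, int] = {}  # block hash -> block length
    raw = 0
    for data in files:
        for block in split_blocks(data, block_size):
            raw += len(block)
            # A block whose hash is already known costs no extra storage.
            store.setdefault(hashlib.sha256(block).hexdigest(), len(block))
    return raw, sum(store.values())


if __name__ == "__main__":
    shared = b"x" * 4096  # stand-in for content repeated across files
    files = [shared + b"a" * 4096, shared + b"b" * 4096]
    raw, stored = dedup_stats(files)
    print(f"raw={raw} stored={stored} saved={1 - stored / raw:.0%}")
```

With fixed-size blocks, repeated content only deduplicates when it lands on the same block boundaries, so the savings depend heavily on the chunking scheme. Content-defined chunking is the usual way to catch repeats that are shifted by a few bytes.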
retreatguru 7 hours ago [-]
How do you use this in your workflow? Please give some examples because it’s not clear to me what this is for.
rkunnamp 5 hours ago [-]
Thank you for sharing this. Sorry for a possible noob question. How are embeddings generated? Does it use a hosted embedding model? (I was trying to understand how semantic search is implemented)
(seems like there's some vague future plans for models like all-MiniLM-L6-v2, all-mpnet-base-v2)
pbronez 12 minutes ago [-]
Hmm, I wonder how much that affects the compression benefits of block-level deduplication. The mock embeddings choose vector elements from a normal distribution, so it’s far from uniform.
izabera 41 minutes ago [-]
not trying to be a hater but how is 100 MB/s high performance in 2025? that's as performant as a 20-year-old hdd
A4ET8a8uTh0_v2 1 hour ago [-]
I like it and I will be perusing your code for what could be used in my 'not yet working' variant.
huqedato 3 hours ago [-]
In my RAG I use qdrant w/ Redis. Very successfully. I don't really see the use of "another memory system for LLM", perhaps I'm missing something.
Although I developed it explicitly without search, and catered it to the latest agents, which are all really good at searching and reading files. Instead, you and the LLMs organize your context to be easily searchable (folders and files). It’s meant for dev workflows (i.e. a project’s context, a user’s context).
I made a video showing how easy it is to pull in context to whatever IDE/desktop app/CLI tool you use
I see stuff like this, and I really have to wonder if people just write software with bloat for the sake of using a particular library.
SJC_Hacker 6 hours ago [-]
Blame the committee for refusing to include basic functionality like regular expressions, networking and threads as part of the STL
menaerus 6 hours ago [-]
The reason for depending on Boost in this repo is just a few search characters away - he needs an HTTP/WebSocket implementation and Boost.Beast provides it. The actual bloat in this repo is Conan.
airstrike 2 hours ago [-]
This feels like a shallow dismissal, which is frowned upon per the HN guidelines
noodletheworld 6 hours ago [-]
? Are you complaining about MCP or boost?
It’s an optional component.
What do you want the OP to do?
MCP may not be strictly necessary but it’s straight in line with the intent of the library.
Are you going to take shots at llama.cpp for having an http server and a template library next?
Come on. This uses Conan, it has a decent CMake file. The code is ok.
This is pretty good work. Don't be a dick. (Yeah, I'll eat the downvotes, it deserves to be said)
pessimizer 7 hours ago [-]
Boost is a nearly 30-year-old open source library that provides stuff for C++ that most standard libraries for other languages already have out of the box. You seem to think that it is hipster bullshit rather than almost a dinosaur itself.
sitkack 7 hours ago [-]
How would you use the built in functionality to enable graph functionality? Metadata or another document used as the link or collection of links?
The tool has built-in versioning. Each file gets a unique SHA-256 hash on storage (automatic versioning), you can update metadata to track version info, and use collections/snapshots to group versions together. I have been using the metadata to track progress and link code snippets.
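The scheme described above (content hash as identity, metadata carrying version info) can be sketched roughly as follows. This is an in-memory illustration under my own assumptions, not the tool's API: `ContentStore`, `put`, `get`, and `history` are hypothetical names, and the `status`/`previous` metadata keys are made up for the example.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store; every name here is hypothetical."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}          # SHA-256 hex digest -> content
        self._meta: dict[str, dict[str, str]] = {}  # digest -> metadata
        self._versions: dict[str, list[str]] = {}   # logical name -> digests, oldest first

    def put(self, name: str, content: bytes, **metadata: str) -> str:
        """Store content under its SHA-256 digest and record it as a version of `name`."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # identical content is stored once
        self._meta.setdefault(digest, {}).update(metadata)
        history = self._versions.setdefault(name, [])
        if not history or history[-1] != digest:
            history.append(digest)  # only a change in content creates a new version
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def history(self, name: str) -> list[str]:
        return list(self._versions.get(name, []))


if __name__ == "__main__":
    store = ContentStore()
    v1 = store.put("notes.md", b"draft", status="wip")
    v2 = store.put("notes.md", b"final", status="done", previous=v1)
    print(store.history("notes.md") == [v1, v2])
```

Linking versions through a metadata field such as `previous` is one way to get graph-like relationships without a separate link document; whether the real tool supports that pattern directly is not confirmed in this thread.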
mempko 9 hours ago [-]
Wicked cool. Useful for single users. Any plans to build support for multiple users? Would be useful for an LLM project that requires per user sandboxing.
winterrx 10 hours ago [-]
The domain listed on the GitHub repo redirects too many times.
blackmanta 9 hours ago [-]
That should be fixed now. It was a misconfiguration of Cloudflare SSL with GitHub Pages.
https://github.com/jerpint/context-llemur
https://m.youtube.com/watch?v=DgqlUpnC3uw