Per entry overhead? #1894

javafanboy · 2025-08-29T16:28:38Z

javafanboy
Aug 29, 2025

I developed a small program that tries to measure the per key-value pair overhead of some caches, and Caffeine's seems quite high: almost 110 bytes!
Given what metadata is kept, I would have expected it to need less than other caches. Am I measuring this wrong, or could it really be this high?

Coherence LocalCache ≈ 81 bytes
CaffeineCache ≈ 108 bytes
Guava Cache ≈ 70 bytes
HashMap ≈ 38 bytes

ben-manes · 2025-08-29T18:18:09Z

ben-manes
Aug 29, 2025
Maintainer

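I have a small program that does something similar; see the Memory Overhead docs (https://github.com/ben-manes/caffeine/wiki/Memory-overhead). It uses an older estimation technique. A more accurate approach is Java Object Layout (https://github.com/openjdk/jol), which is used via build tasks (https://github.com/ben-manes/caffeine/blob/master/gradle/plugins/src/main/kotlin/analyze/object-layout.caffeine.gradle.kts), and the most accurate estimate would come with JEP-8249196 (https://openjdk.org/jeps/8249196).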

A few important things to note:

1. Java memory layout is aligned to machine word boundaries.
2. Caffeine uses code generation of specialized classes per configuration to avoid unused fields (e.g. a timestamp if TTL is not used).
3. Caffeine maintains lazily initialized secondary data structures (timer wheel, count-min sketch, ring buffers). These are separate from the entry but might be included in your overhead counts.
4. Many caches use customized hash tables to inline their metadata onto the map's entry, but concurrent hash tables have become very complex because they are so performance sensitive. Caffeine instead wraps the value so that it benefits from hash table improvements and avoids bugs or being pinned to a buggy fork, at the cost of a little extra overhead.
5. An on-heap cache is not often massive since there is object bloat, GC penalty, etc. That leads to layered caches (heap, off-heap, remote), which can trade off serialization, compression, etc. costs and apply them at the proper tier. That means the Caffeine cache might be moderately sized: its hit rate and performance are most critical, to avoid missing to L2/L3/SOR loads, but it is small enough that the object/GC penalties are not as important. On the flip side, large remote caches like memcached are less concerned with hit rates, as they often require 99% hits for SLAs (e.g. Twitter), and the network/serialization costs dominate any data structure performance. They care most about capacity planning to minimize infrastructure costs (maximize entries per MB, aggressively expire to reclaim space, scale enough to saturate the network link). All of this is to say that a cache's design has to optimize for its target audience, since the tradeoffs change significantly based on the layer it is used at.
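To make the alignment and unused-field points concrete, here is a rough arithmetic sketch. The constants are assumptions for a typical 64-bit HotSpot JVM (12-byte object header, 4-byte compressed references, 8-byte object alignment), the class and method names are hypothetical, and the estimate ignores any padding between fields. It is not Caffeine's actual entry layout; a tool like JOL should be used for real numbers.

```java
/** Rough per-object size arithmetic; all constants are assumptions, not measured values. */
final class EntrySizeEstimate {
  static final int HEADER_BYTES = 12;   // assumed: header with compressed class pointers
  static final int REFERENCE_BYTES = 4; // assumed: compressed object references
  static final int LONG_BYTES = 8;      // a long field, e.g. a timestamp
  static final int ALIGNMENT = 8;       // assumed: default HotSpot object alignment

  /** Rounds size up to the next multiple of alignment. */
  static long align(long size, int alignment) {
    return ((size + alignment - 1) / alignment) * alignment;
  }

  /** Estimated shallow size for the given field counts, ignoring padding between fields. */
  static long shallowSize(int references, int longs) {
    long raw = HEADER_BYTES + (long) references * REFERENCE_BYTES + (long) longs * LONG_BYTES;
    return align(raw, ALIGNMENT);
  }

  public static void main(String[] args) {
    // Hypothetical wrapper holding only key and value references: 20 raw bytes, rounded up
    System.out.println("key + value:             " + shallowSize(2, 0)); // 24
    // A third reference fits into the alignment padding at no extra cost
    System.out.println("key + value + one ref:   " + shallowSize(3, 0)); // 24
    // Adding a timestamp field pushes the object into the next 8-byte step
    System.out.println("key + value + timestamp: " + shallowSize(2, 1)); // 32
  }
}
```

Under these assumptions a single extra field can cost either nothing or a full 8 bytes per entry, which is why dropping unused fields per configuration can matter.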


javafanboy · 2025-08-29T18:40:36Z

javafanboy
Aug 29, 2025
Author

Thanks for the reference to this documentation; I was not aware it existed. You really have excellent and detailed docs for this project!


1 reply

ben-manes Aug 31, 2025
Maintainer

Since that documentation was from 2016, I updated it against the latest version and added JOL to the comparison. A small fix was to populate the cache once, clear it, and take that cleared cache as the baseline, so that the per-entry figure isn't skewed by lazily initialized side structures. The results are probably reasonable guesses.
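The baseline fix described above can be sketched roughly as follows. This is not the program from the docs: it uses a plain `HashMap` as a stand-in for a cache, a heap-delta estimate based on `Runtime`, and hypothetical class and method names. `System.gc()` is only a hint, so the result is an approximation at best.

```java
import java.util.HashMap;
import java.util.Map;

/** Heap-delta sketch: populate, clear, take the baseline, repopulate, then measure. */
final class PerEntryOverhead {

  /** Best-effort used-heap reading; the GC requests are hints, so treat the value as approximate. */
  static long usedHeap() {
    Runtime runtime = Runtime.getRuntime();
    for (int i = 0; i < 3; i++) {
      System.gc();
    }
    return runtime.totalMemory() - runtime.freeMemory();
  }

  static void populate(Map<Integer, Integer> map, Integer[] keys) {
    for (Integer key : keys) {
      map.put(key, key);
    }
  }

  /** Estimates the bytes added per entry, excluding the keys and values themselves. */
  static double estimate(Map<Integer, Integer> map, int entries) {
    // Allocate keys/values up front so they are already part of the baseline
    Integer[] keys = new Integer[entries];
    for (int i = 0; i < entries; i++) {
      keys[i] = i;
    }
    // Populate once and clear so lazily created structures already exist at the baseline
    populate(map, keys);
    map.clear();
    long baseline = usedHeap();

    populate(map, keys);
    long populated = usedHeap();
    return (populated - baseline) / (double) entries;
  }

  public static void main(String[] args) {
    double bytes = estimate(new HashMap<>(), 200_000);
    System.out.printf("approx. bytes per entry: %.1f%n", bytes);
  }
}
```

With a real cache, the `Map` stand-in would be replaced by the cache under test and `map.clear()` by that cache's own invalidation call. For `HashMap`, clearing keeps the already-grown table array, so that array ends up in the baseline rather than in the per-entry figure.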

javafanboy · 2025-08-31T06:56:12Z

javafanboy
Aug 31, 2025
Author

Thanks for the updated metrics - looks like Caffeine is doing quite well!

