
LinkedIn Open Sources PalDB, a Read-only Key-value Store

by Abel Avram on Oct 27, 2015. Estimated reading time: 1 minute

LinkedIn has open sourced PalDB, an embeddable read-only key-value store that, according to the company, is 8 times faster than LevelDB and takes several times less memory than a HashSet.

PalDB is a write-once key-value store written in Java and open sourced by LinkedIn. After the store is created, all operations against it are read-only. Its purpose is to improve read performance and lower the memory footprint. LinkedIn recommends it for storing side data, which they define as “the extra read-only data needed by a process to do its job. For instance, a list of stop words used by a natural language processing algorithm is side data.”

PalDB is embeddable, does not use a schema, and keeps its data in one binary file. It offers random data access via an API.
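The write-once, read-only lifecycle over a single binary file can be illustrated with a minimal sketch that uses only the Java standard library. This is not PalDB's API or file format: the class names `TinyStoreWriter` and `TinyStoreReader`, the string-only keys and values, and the file layout (values first, then an index, then a trailer pointing at the index) are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: NOT PalDB's API or on-disk format.
final class TinyStoreWriter implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<String, Long> index = new LinkedHashMap<>();

    TinyStoreWriter(File path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        file.setLength(0); // start from an empty file
    }

    void put(String key, String value) throws IOException {
        index.put(key, file.getFilePointer()); // remember where the value starts
        file.writeUTF(value);
    }

    @Override
    public void close() throws IOException {
        long indexStart = file.getFilePointer();
        file.writeInt(index.size());
        for (Map.Entry<String, Long> entry : index.entrySet()) {
            file.writeUTF(entry.getKey());
            file.writeLong(entry.getValue());
        }
        file.writeLong(indexStart); // trailer: offset at which the index begins
        file.close();
    }
}

final class TinyStoreReader implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<String, Long> index = new HashMap<>();

    TinyStoreReader(File path) throws IOException {
        file = new RandomAccessFile(path, "r"); // opened read-only
        file.seek(file.length() - Long.BYTES);  // the trailer is the last 8 bytes
        long indexStart = file.readLong();
        file.seek(indexStart);
        int count = file.readInt();
        for (int i = 0; i < count; i++) {
            String key = file.readUTF();
            index.put(key, file.readLong());
        }
    }

    /** Returns the value stored for the key, or null if the key is absent. */
    String get(String key) throws IOException {
        Long offset = index.get(key);
        if (offset == null) {
            return null;
        }
        file.seek(offset); // random access: jump straight to the value
        return file.readUTF();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

In this sketch the store becomes readable only after the writer is closed, and the reader exposes no method that modifies the file, which mirrors the create-once, read-many usage described above.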

Being optimized for read operations, its performance is comparable to in-memory data structures such as HashMap or HashSet, according to LinkedIn, but it takes significantly less space in memory, one of the main benefits the company was looking for when designing it. For example, a HashSet with 100M keys needs over 500MB while PalDB takes about 80MB. Similarly, 35M member IDs need 1.8GB of RAM in a HashSet compared to 290MB for PalDB. Data can be compressed in PalDB using Snappy for an even smaller footprint.
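As a quick check on the "several times less memory" claim, the ratios implied by the quoted figures can be computed directly. The sketch below assumes that "over 500MB" is taken as 500MB (so the first ratio is a lower bound) and that 1.8GB is taken as 1800MB; the class name `FootprintRatio` is made up for this example.

```java
// Back-of-the-envelope check of the figures quoted above (sizes in MB).
final class FootprintRatio {
    /** How many times larger the HashSet footprint is than the PalDB one. */
    static double ratio(double hashSetMb, double palDbMb) {
        return hashSetMb / palDbMb;
    }

    public static void main(String[] args) {
        // 100M keys: at least 500MB in a HashSet vs. about 80MB in PalDB
        System.out.println("100M keys: " + ratio(500, 80));
        // 35M member IDs: 1.8GB (taken as 1800MB) vs. 290MB
        System.out.println("35M member IDs: " + ratio(1800, 290));
    }
}
```

Both examples work out to a HashSet footprint a little over 6 times that of PalDB.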

In terms of speed, a test performed by LinkedIn shows that PalDB performs 2M reads/s, which is 6 times faster than HashSet and 8 times faster than LevelDB or RocksDB, on a 3.1 GHz MacBook Pro with a 10M-key index.

PalDB was optimized for memory access. Keeping the data on disk will result in considerably poorer performance. While there is no limit on the size of the data, the size of the index is limited to 2GB. It is also important to know that PalDB is not thread safe.
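Because the store is not thread safe, callers that share one reader across threads need to synchronize access themselves. One simple way to do that is to wrap the reader so that only one lookup runs at a time, as in the sketch below. The `KeyValueReader` interface and the `SynchronizedReader` class are hypothetical stand-ins written for this example, not PalDB types.

```java
// Hypothetical stand-in for a reader that is not safe to share across threads.
interface KeyValueReader<K, V> {
    V get(K key);
}

// Wrapper that serializes lookups so only one thread reads at a time.
final class SynchronizedReader<K, V> implements KeyValueReader<K, V> {
    private final KeyValueReader<K, V> delegate;
    private final Object lock = new Object();

    SynchronizedReader(KeyValueReader<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V get(K key) {
        synchronized (lock) {
            return delegate.get(key);
        }
    }
}
```

A single lock trades throughput for safety. An alternative under the same assumptions is to give each thread its own reader instance, which avoids contention at the cost of extra memory.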
