PostgreSQL Gets Better NoSQL Capabilities
PostgreSQL 9.4 Beta comes with the much-anticipated "binary JSON" type, JSONB. This new storage format for document data is higher-performance and comes with indexing, functions and operators for manipulating JSON data.
The JSONB type is a confluence of two projects - HStore and JSON. JSONB has everything JSON had, but is more efficient in storage due to the binary representation, and faster due to indexing. Eventually, all current HStore and JSON users are expected to move to JSONB.
Why would NoSQL features be important for PostgreSQL, when it's traditional user base has been developers needing solid relational capabilities or users switching from enterprise databases such as Oracle? Josh Berkus, one of the core team members, shares some insights.
The Oracle market, while presently large, is not growing, at all. The actual population of Oracle users who can defect to PostgreSQL is even smaller than that. No matter how many features we ever add, we can never be better at being Oracle than Oracle can.
Innovative technologies succeed because they reach new users and grow the market.
The other comments in the same thread also make it clear how several startups using PostgreSQL with technologies such as Node, Python, Go or Ruby are looking forward to using fast JSON support.
Considering these points, it is evident that first-class, performant support for JSON without compromising on it's traditional strengths such as reliability, can help PostgreSQL grow into new use cases.
There are still areas where other NoSQL databases could be better fit - for e.g. Bruce Momijan, a core team member and Senior Database Architect at EnterpriseDB, suggested (via email) -
Even as PostgreSQL expands in functionality and capability, there are still some edge cases where a NoSQL solution might be a better fit. Cassandra, being columnar, is good for unstructured data with lots of duplication, like log files, and it’s good at adding new nodes. Postgres will remain out front, however, for applications that can benefit from NoSQL functionality but require ACID compliance too.
Robert Haas, a major contributor and Senior Database Architect at EnterpriseDB added -
One example is if you're using MapReduce for highly parallel query processing. In that case, you might be better off with something like Hadoop. Another example would be if your data model is going to require lots of small changes to large JSON objects. Some NoSQL solutions may be able to handle this case more efficiently.
Other than specific uses cases, however, for developers needing flexibility between schema-less and relational storage for different types of data in the same application, as well as ACID guarantees, Postgres is beginning to make a compelling case.
Note that JSONB is not the only major features introduced in 9.4. There are several others -
- Data Change Streaming API allows decoding and transformation of the replication stream. This lays the foundation for new replication tools that support high-speed and more flexible replication and scale out solutions.
- Materialized views with "refresh concurrently"
- ALTER SYSTEM SET, enables modifying postgresql.conf from the SQL Command Line and from remote clients, for easier administration
As well as several minor features -
- Dynamic Background Workers
- Replication Slots
- Write Scalability Improvements
- Several performance improvements, such as for aggregates, smaller GIN indexes, reductions in WAL volume
- Updatable security barrier views
- New Array manipulation and table functions
- Time-delayed standbys
- MVCC system catalog updates
- Decrease lock level for some ALTER TABLE commands
- Backup throttling
- WITHIN GROUP
You can see the release notes for a full list of changes.