Building Scalable Applications in .NET: Introducing the FatDB Distributed Computing Platform
Building modern, scalable applications can be a daunting task. Consumers expect high performance and rapid response times with “always available” access to their data. NetOps expects the application to be easy to configure, maintain, migrate and troubleshoot. The development team wants familiar paradigms, simple APIs, good tooling and to leverage existing staff and the legacy code base. Finally, businesses demand a competitive advantage: They want an application to be cost-effective, achieve quick time to market (TTM), and be able to rapidly adapt to evolving business strategies. In the past, it was almost guaranteed that an enterprise would sacrifice a great number of these needs in the race to deploy an application. Even today, it’s common to find many disparate technologies duct-taped together into a byzantine edifice that tends to be brittle, slow and overly complicated. The current explosion of new Cloud-, PaaS- and NoSQL-related technologies than can handle many of these shortcomings is no coincidence; the blunt reality is that “doing it right” and meeting the needs of stakeholders is hard and requires abandoning old patterns and adopting new ways of thinking.
The FatDB Solution
About six years ago, Wilshire Media, the sister company of FatCloud, built some of the largest streaming media Web properties at the time for companies like CBS Radio, AOL and Yahoo. The development of these applications shared nearly identical characteristics and challenges. We found that we were doing the same things over and over and the projects never seemed to get any easier. We needed to come up with a solution that would:
- Communicate between nodes in a flexible and fault tolerant way.
- Store, retrieve and query data and files.
- Cache data.
- Parallelize and eliminate network latency wherever possible for extreme loads.
- Perform both synchronous and asynchronous multithreaded processing of frequently changing business logic.
- Add scale or reconfigure a cluster with absolute minimal downtime.
- Automate configuration, monitoring and maintenance of large clusters.
- Manage very aggressive time frames where TTM was paramount.
There came the realization that we could capitalize and leverage our previous work as well as fulfill stakeholder needs and reduce sacrifices in the process. Solving those problems required one simple concept: to abstract and generalize the essential characteristics of any enterprise application into a proven, reusable, performant, fault-tolerant, integrated and distributed platform. The fulcrum of our approach is what we now call our Mission Oriented Architecture or MOA. This generic framework has since evolved into what is known today as the FatDB platform technology. About two years ago, the FatDB technology got its first major test; Cricket Wireless engaged in a major mobile music initiative called Muve music that today serves over 1million subscribers. The Muve backend is built almost entirely on the FatDB platform and runs on hundreds of servers. Since then, we have taken this technology and formed a company around it called FatCloud. The number of different applications that can be created on top of the FatDB platform is limitless and the range of enterprises using the FatDB platform spans many verticals. Our mission is to simplify the complexity and cost around enterprise development by providing this technology to the Microsoft development community. We want to help our customers avoid making the same mistakes that we initially did – by providing them with a set of powerful and mature building blocks, which they can use to radically improve the way they bring Web scale applications to market, and, to solve every challenge.
So what are the building blocks that compose the FatDB platform? The core product can be split into nine areas:
- Core Foundation – This provides all the basic plumbing for everything in the FatDB platform.
- FatDB / Cache – A flagship NoSQL database and memory cache service.
- FatFMS – A distributed file management service.
- FatWQ – An asynchronous batch job queuing and processing service.
- FatApps – A Software Development Kit (SDK) and architecture for synchronous Windows Communication Foundation (WCF) based services
- FatProcessors – An SDK and architecture for providing asynchronous batch job processing routines that integrate into the FatWQ.
- FatDB Management Studio (FatDBMS) – A Graphic User Interface (GUI) that allows users to configure, inspect, monitor, query, deploy and maintain their servers, data, and business logic in each of their environments (development, staging, production, etc.)
- SQL Integration – A functional subset of FatDB, FatWQ, FatFMS and FatDBMS that allows users to preserve and enhance any investment in SQL Server, in a few powerful ways, by leveraging FatDB’s core services.
- Map/Reduce type analytics – A functional subset of FatDB, FatWQ and FatDBMS that enables the performance of data mining or Online Analytical Processing (OLAP) type analytics, which can help distill vast amounts of data into actionable business metrics.
Core Foundation: The Plumbing
The core foundation is composed of a set of APIs, which are available to all users; meaning everything needed to build our core services such as FatDB or FatFMS, can be used by our customers to build their own services. The basic functions of this core foundation are as follows:
- Portable application host – FatCloud is an application host that runs as a Windows service. The communication endpoints are implemented using WCF to provide a standard and familiar paradigm for Microsoft developers. Business logic WCF services can be exposed and hosted as first-class citizens within the FatDB platform cluster. Additionally, the FatDB platform server software can be run, sandboxed or hosted on any virtual or physical windows server – it can also be moved to a public IaaS, a company’s own datacenters, or a hybrid of both.
- Discovery – The cluster self-aggregates with server discovery and broadcasting via either User Datagram Protocol (UDP) multicast where available, a discovery service broker such as Azure, or Transmission Control Protocol (TCP) based strategies including “gossip.”
- Proxy and Service Layers – All services in FatCloud are partitioned into two layers: the proxy layer and the service layer. The proxy layer routes calls to the appropriate servers and aggregating results and the service layer does the actual work: it executes business logic or stores and retrieves data. By cleanly separating the two concerns, we have facilitated deployment strategies that will help us eliminate bottlenecks and spread the load via the proxy layer while keeping the “heavy lifting” on the service layer where it should be.
- Unified routing – The ability to route all service calls in a unified fashion coupled with the application hosting capabilities of the FatDB platform is one of its key product strengths and unique differentiators. FatDB relies on a 20 byte SHA hash key to provide a locator that is inherently “collision proof” and evenly distributed within any one of our sub-cluster groups. This concept is at the heart of FatCloud’s MOA. And, each sub-cluster, called a group, is partitioned into shards and mirrors for availability and partition tolerance. The servers are arranged in a grid, with each mirror functioning as a primary for even load distribution unlike “ring” or “master-slave” architectures in related technologies. This structure allows for excellent scalability, fault tolerance, and responsiveness with no “hot spots.”
- Unified API – The FatDB platform comes with a set of APIs that are exposed so that the developer can construct FatApp services and FatProcessors with the same functionality as its core services. This includes all service call access strategies used for data consistency such as “quorum” access, parallel access and sequential access, as well as the hooks and callbacks for anti-entropy and maintenance mechanisms. The list is both extensive and comprehensive. For example, imagine creating business logic that relies on “quorum” access and executes on the server where the data resides by merely providing a one-line callback and some skeleton code in the proxy!
FatDB / Cache: The NoSQL Database
FatDB is our flagship service. It is a NoSQL, eventually consistent, multi-model database that has several key features:
- Flexible data consistency – There is a variety of data consistency strategies that the customer can choose for each database they create. Consistency levels can range from full to weak and will soon have global consistency strategies for multi-datacenter scenarios.
- Data model version tolerance – Data in FatDB can either be stored as opaque blobs or transparent objects. We utilize ProtoBuf as our serialization mechanism for transparent objects. Since we retain metadata for every object version there is a great deal of tolerance in the object model. Put simply, down the road users will be able to add and delete fields from an object definition without the fear of breaking existing client code or running mass data conversion operations.
- Locking – FatDB provides a facility to lock the object level, which ensures one at a time updates and works at a strong consistency level.
- Distributed LINQ – LINQ is used as the query syntax and fields can be indexed in object definition. If a LINQ query refers to a field that has not been indexed, it will invoke an operation similar to a table scan in SQL server. Also, it has the ability to index opaque blobs at a rudimentary query capability. Our LINQ implementation currently supports projections, sorting and paging and will also support joins in the near future!
- Caching – FatDB is, at its core, a memory cache with optional disk based persistence. It can be used to define a sliding window expiry or an absolute window expiry on any database. (Note: At this time, LINQ queries are not yet supported on a memory only (non-persisted) cache.)
- SQL integration – FatDB can help unload the SQL server for data that is read heavy and suitable for migration. FatDB can be a high-performance persisted repository for migrated SQL data. FatDB can also be a memory-only cache with read-through and write-through in front of SQL server. And it can act as a distributed web session cache in lieu of SQL server.
- Map/Reduce – By using distributed LINQ, simple map/reduce queries can be affected. This allows users to create basic mapper and reducer statements and executes them for rudimentary data analytics across a group.
FatFMS: The File Management Service
FatFMS is our distributed file management system. At its heart, FatFMS consolidates all the direct attached storage in a sub-cluster and treats it as one large hard drive. There are a few aspects of FatFMS that are noteworthy.
- Multiple copies – With the FatFMS, a user can store any number of copies of a given item across different hard drives.
- Flexible retrieval – Users can retrieve files by UNC list, URL list or stream.
- Metadata tag and search – Users of FatFMS can tag files with metadata and search for those files matching a given set of tags.
- SQL integration – FatFMS can help take the load off of SQL by storing large objects as physical files rather than keeping them in the SQL server database.
FatWQ: The Asynchronous Work Queue
The FatWQ is a high-speed asynchronous batch processing job work queue and execution engine that is backed by FatDB for all the goodness it brings. FatWQ works directly with simple FatProcessors that a user provides to execute individual steps of the job graph. Whatever a developer can code in C# can be exposed as a job processor. The philosophy of asynchronous “divide and conquer” is at the core of effective use of the technology.
Examples of possible usages are:
- Processing incoming leads for a sales team.
- Valuing stock portfolios for a brokerage firm.
- Transcoding media files.
- Fraud detection of financial transactions.
- Facial recognition processing from video feeds.
- Rendering animation frames.
- Calculating the Nth digit of Pi.
- Ingesting and normalizing sensor data packets.
- Data mining or Map/Reduce processing.
- Calculating massive collaborative filters
FatWQ also provides some powerful features:
- Backed by FatDB – Because it uses FatDB as a backing store, it can reliably process jobs on the SAME node that they are stored on with the same data consistency guarantees.
- Data locality – Because it uses a unified routing mechanism, it can potentially process the jobs on the SAME node where the input data resides, resulting in a much higher degree of performance than it would otherwise achieve with a strict SOA approach.
- Composable job graphs – Individual job steps can be composed in a directed acyclic graph structure that executes in perfect synchrony, which can provide the ability to execute complex work flows with heavy parallelism.
- Priority based scheduling – Jobs can be prioritized in the queue.
- Job processors – Users can write their own job processors, which can handle one or more types of job steps.
- Standalone queue – FatWQ can be used as an MSMQ replacement if a user wishes to offload the job processing to some other entity or use the job wrapper in an entirely new way.
- SQL integration – FatWQ can effectively offload workflow management from SQL server based approaches, which tend to be slow and brittle.
FatApps and FatProcessors: The Application Platform
An essential part of any enterprise application is its business logic. The FatDB Platform provides a facility to host a user’s WCF services as first-class, which are tightly integrated to the core. By exposing business logic in this way, a user gains all the functionality of the underlying platform. All the plumbing and heavy lifting for scale, routing and redundancy become an integral part of the service without any effort from the user. These capabilities are highly valuable – When the app hosting capabilities of the FatDB platform are coupled with the other core services, a user would see how an integrated platform could deliver synergy. For synchronous request-response traffic, the user could simply write a traditional WCF business logic service and host it as a FatApp. For asynchronous processing, they would create an assembly containing their business logic class that implements a very simple interface with two methods. Both FatApps and FatProcessors would be deployed using the FatDBMS into the sub-cluster group and automatically installed on all the correct machines. Then, the entire enterprise application can be written this way. This is the new way of thinking… FatCloud builds, deploys and scales aspects of the enterprise applications at the sub-cluster or group level as a cohesive unit; data and processing are melded into one.
Bringing it all together: The Mission Oriented Architecture
Application architecture has evolved from everything packed into large, monolithic boxes to highly distributed services running with maximum separation of concerns. This is the essence of a Service Oriented Architecture (SOA) and it mirrors concepts found in Object Oriented Programming (OOP). The concept behind separation of concerns is an admirable one because it isolates functional units and promotes loose coupling. However, just as in OOP, the separation of concerns mantra can go too far… there must be a balance between complexity and performance. Three costly examples resulting from the failure of SOA (as it is commonly practiced today) as a viable model include:
- Performance – If data is isolated from processing, it will incur extra network latency in storing and retrieving…period. In many cases, this latency debt can accumulate to the point of making certain high throughput; low latency use cases fail spectacularly or it will lead to a need to find a scale-up solution, which can cost an unnecessary amount of capital. The need to combine processing and data is exactly why stored procedures were invented.
- Configuration and communication – The “a-la-carte” mentality is all the rage these days. Enterprises want to combine the technology from different vendors and make it all work together. What is most often overlooked is the simple fact that there are inevitable incompatibilities between components and, in order to make all the components play nice, the user will have to write a host of adapters and abstraction layers. Without an integrated approach, they are likely to have a hodgepodge of components thrown together – a plethora of communication strategies, and a nightmare of configuration files. Maintaining, changing, or troubleshooting such an inelastic system will be hard and inevitably require a few extremely high-level engineers to maintain it.
- Scaling and pivots – There probably will come a time when an enterprise application must scale to meet increased demand. Using the traditional “a-la-carte” approaches, the app must scale each component individually. This can be a real pain for NetOps, as it requires many manual changes to servers and configurations. Why can’t servers just be added to get greater capacity? Another potentially worse problem is “painting ourselves into a corner.” Often, business logic assumptions are baked into the architecture and it can be very costly for the development team to pivot when business needs change. A lot of time and effort has been expended to glue together the components and it will take time to un-glue and re-glue the pieces together into a different configuration.
Much of this can be alleviated by thinking in terms of clustering customers, data and business logic into cohesive groups that are defined by the mission that they perform. Let’s use an e-commerce site as an example: Some components that need to be managed are the user accounts, wish-lists, likes, dislikes, order history, etc., so let’s define a “user group,” which contains all of the business logic and most of the data that it’s likely to consume. We also know the product catalog and ingestion work that we have to do to integrate with vendor feeds… so let’s define a group that contains that data and business logic. Then there is a need to handle orders and fulfillment so this can be defined as their own groups as well. Financial transactions and reporting will be another group, but it will utilize the SQL Server, as it excels in that area.
By defining these groups and clustering the data with the business logic, it can immediately reap a performance advantage. Additionally, scaling and pivots become much easier as users simply add servers to the group for more capacity or swap out FatApps or FatProcessors when they pivot. Finally, because FatDB has built the data on a standard platform that unifies routing, scaling, configuration, discovery and fault tolerance, it can concentrate on the business logic and not get burned by the 90-90 rule (i.e. the last 10 percent takes another 90 percent of the development time! – credit: Tom Cargill, Bell Labs.) This is the FatCloud vision of the future of distributed computing, which is our version of MOA.
FatDB Management Studio: The Tooling Perspective
One of the biggest requests we hear from customers is the need for great tooling. Day-to-day management of environments requires several things, which FatDB has provided in a user-interface with a familiar paradigm (it looks a lot like SQL Management Studio.) Existing and planned features are:
- Manipulation of basic elements – Users can create, view, configure, update and delete all the basic elements in the cluster’s environments, groups, servers, data stores, file stores, queues, FatApps, FatProcessors and server instances. Drag and drop enables easy restructure and group organization of servers, data, and business logic apps and processors.
- Inspect and query – Customers can inspect, edit, and query data stores, file stores, and queues using the LINQ syntax.
- Monitor, maintain, scale and pivot – Users can monitor the real time status of their environments, groups and servers, and perform various maintenance tasks. They can also add scale or reconfigure groups and environments in real time without being forced to take a system outage window.
- Establish SQL server and code integration bindings – Users can easily integrate with SQL server and relieve bottlenecks by establishing FatDB as a cache and/or importing data from and exporting data to SQL server. As a cache, FatDB can also manage the user’s data models and import/export data transfer objects (DTOs) for an Object Relational Mapping (ORM) set of functionality. SQL Server Integration Services (SSIS) packages are also available for bulk operations.
- Map/Reduce type analytics – Users can run Map/Reduce analytics via simple LINQ statements or by visually creating dynamic job step graphs. This is one of the most exciting and powerful features of the FatDBMS, which will see lots of activity in the coming months.
- Additional features – The roadmap binds with public IaaS providers such as Amazon and Azure. Additionally, the FatDBMS will be exposed as a Visual Studio add-in.
The FatDB platform offers a full ecosystem for developing modern distributed .NET applications. FatCloud provides all the basic building blocks that an enterprise needs to create a complete solution for either new “greenfield” development or for enhancing and evolving legacy investments in existing architecture, hosting environment, hardware and software. By adopting the MOA mindset, an enterprise can build an application that can be both more successful than traditional methods or SOA approaches and easier to relocate to a different datacenter or IaaS, configure, maintain, scale and pivot. Furthermore, costs, risks, sacrifices, and TTM can be radically reduced. FatCloud is one of the first of many integrated, distributed computing application hosting platforms that will revolutionize the way that web scale application development is done today.
About the Author
Justin Weiler’s talent for designing, architecting and developing enterprise software began at an early age, building his first program when he was just 12 years old. It’s this passion that led him to deliver some of largest-scale media properties for clients including, CBS Radio, AOL, Yahoo and Cricket Wireless. Through these fascinating but challenging projects, Justin was able to take with him the lessons he learned and become CTO at FatCloud and primary contributor to the Windows platform designed to accelerate large distributed and cloud computing projects. You can download FatDB for free here or you can send question about FatDB at firstname.lastname@example.org.
Oliver Wegner, Stefan Tilkov Jul 20, 2014