Key Takeaways
- Mutable models should be self-validating and implement a validation interface.
- Consider using immutable models when sharing objects, especially across threads.
- Consider single and multi-level undo support for MVVM style UIs.
- Avoid unnecessary memory allocations when implementing property change notifications.
- Don’t override the equality and hash code methods in models.
Traditional MVC, MVP, MVVM, Web MVC: the common element in every UI pattern is the model. And while there are many articles discussing the views, controllers, and presenters in these architectures, almost no thought is given to the models. In this article we'll look at the model itself and the .NET interfaces that models implement.
To ensure clarity, I would like to start by defining some terms. You may see more precise definitions in other articles, but these will serve our purpose.
Data Model
This is an object or object graph that incorporates both data (i.e. properties and collections) and behavior. Data models are the main focus of this article.
Data Transfer Object (DTO)
A DTO is an object or object graph that only contains properties and collections. A true DTO has no behaviors whatsoever and is almost never immutable.
Though it strains the definition slightly, some simple interfaces such as INotifyPropertyChanged are often supported by DTOs that were created using code generation.
Object Graph
An object graph consists of an object and all of the child objects that can be reached from it. When discussing data models and DTOs, object graphs are one-way tree-like structures. (Circular graphs are possible, but they play havoc on serialization frameworks.)
Domain Model
A domain model is the higher level concept that describes a set of related data models.
Entity
There are many definitions for the term “entity”, some of which are essentially the same as “data model”. Since the popularization of nHibernate and Entity Framework, the term has roughly come to mean a DTO that is mapped one-to-one with a database table.
Under this definition, the entity may be decorated with attributes to more exactly describe the column/property mappings. It may also support lazy-loading of child collections from the database.
While an entity can be expanded to serve the role of a full data model, it is more common for the entity to be mapped to a separate data model or DTO prior to applying any business logic.
Business Entity
Not to be confused with an ORM’s entity, this is another way of referring to a data model.
Immutable Objects
An immutable object is one that has no property setters or methods that allow it to be visibly altered. An immutable object is not a data model itself, but it may appear in one to represent static lookup data. Because they cannot be altered, it is safe to share a single immutable object across multiple data models.
Data Access Layer (DAL)
For the purpose of this article, the DAL encompasses any service objects, repositories, direct database calls, web service calls, etc. Basically anything used to interact with external dependencies such as data storage.
Characteristics of a Data Model
A real data model is deterministically testable. Which is to say, it is composed only of other deterministically testable data types until you get down to primitives. This necessarily means the data model cannot have any external dependencies at runtime.
That last clause is important. If a class is coupled to the DAL at run time, then it isn’t a data model. Even if you “decouple” the class at compile time using an IRepository interface, you haven’t eliminated the runtime issues associated with external dependencies.
When considering what is or isn't a data model, be careful about "live entities". In order to support lazy loading, entities that come from an ORM often include a reference back to an open database context. This puts us back into the realm of non-deterministic behavior, wherein the behavior changes depending on the state of the context and how the object was created.
To put it another way, all methods of a data model should be predictable based solely on the values of its properties.
Message Passing Between Parent and Child Objects
Parent and child objects often need to communicate with each other. When done incorrectly, this can lead to tightly cross-coupled code that is hard to understand. To simplify matters, follow these three rules:
- Parent objects may directly interact with properties and methods on child objects.
- Child objects may only interact with parent objects by raising events.
- No object may directly interact with sibling objects. Messages must always pass through a shared parent.
With this design, you can always cleave off a child object and test it without its parent object. The test itself can monitor for events otherwise handled by the parent object.
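As a minimal sketch of these three rules (the Order and OrderLine classes are hypothetical, invented for illustration):

```csharp
using System;

public class OrderLine
{
    decimal m_Price;

    // Rule 2: a child communicates with its parent only by raising events.
    public event EventHandler PriceChanged;

    public decimal Price
    {
        get { return m_Price; }
        set
        {
            m_Price = value;
            PriceChanged?.Invoke(this, EventArgs.Empty);
        }
    }
}

public class Order
{
    public OrderLine Line { get; } = new OrderLine();
    public decimal TotalPrice { get; private set; }

    public Order()
    {
        // Rule 1: the parent interacts with the child directly,
        // including subscribing to its events.
        Line.PriceChanged += (s, e) => TotalPrice = Line.Price;
    }
}
```

In a unit test, an OrderLine can be created without an Order; the test simply subscribes to PriceChanged in the parent's place.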
Validation – The Only Must-Have Feature of a Data Model
I am going to talk a lot about optional features a data model may implement. But before we get into those, I should first discuss the one feature every data model must address: validation.
It is almost impossible to have a data model that isn’t dealing with unclean data. If the model is populated by a file, external application, or UI, chances are there will be ways to introduce disallowed or inconsistent values. The UI is especially problematic, as the user is often expected to fill out a form one field at a time.
Given these constraints, you cannot use exceptions in the constructors and property setters like you would for other types of classes. Instead you need a validation interface that provides some flexibility as to when error checking occurs.
Out of the box, .NET offers a few validation interfaces. But each has its own challenges.
IDataErrorInfo
The IDataErrorInfo interface has been available since the beginning. However, it has been largely deprecated because it is rather hard to use. Let us consider its properties.
string Error { get; }: This property serves three roles:
- Reporting object level errors
- Reporting all property level errors
- Indicating if no errors exist by returning an empty string.
string this[string columnName] { get; }: This indexer property returns property specific errors.
As you can see, the Error property is doing far too much. It is cramming everything into one string, making it impossible to distinguish between object-level and property-level validation errors. And if you were to redefine it to only cover object-level errors, then you would lose the ability to tell whether or not the object as a whole has any errors.
As for the indexer, how would you even call it? The only way to access it is by casting the object to an IDataErrorInfo variable. And even then, few people would expect code like this:
var nameError = ((IDataErrorInfo)customer)["Name"];
If your UI framework requires this interface, I suggest you bake it into a base class that exposes a more sensible validation API. Once connected to your real validation logic, you can ignore the fact IDataErrorInfo even exists.
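As a sketch of that approach, a base class along these lines can satisfy IDataErrorInfo while exposing a more sensible API (the ModelBase name and the GetErrors/GetAllErrors methods are assumptions, not part of any framework):

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

public abstract class ModelBase : IDataErrorInfo
{
    // The "real" validation API exposed to the rest of the application.
    public abstract IReadOnlyList<string> GetErrors(string propertyName);
    public abstract IReadOnlyList<string> GetAllErrors();

    // Explicit interface implementation keeps the awkward members hidden
    // unless the object is cast to IDataErrorInfo (e.g. by a UI framework).
    string IDataErrorInfo.Error
        => string.Join(Environment.NewLine, GetAllErrors());

    string IDataErrorInfo.this[string columnName]
        => string.Join(Environment.NewLine, GetErrors(columnName));
}
```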
INotifyDataErrorInfo as Formally Defined
I am going to discuss the INotifyDataErrorInfo interface twice. In this section I'll explain how INotifyDataErrorInfo was intended to be used. Then in the next section I'll explain how I think it should actually be used.
The INotifyDataErrorInfo interface was designed to support asynchronous validation in Silverlight 4. The basic idea was that modifying a property would trigger a service call. Eventually that service call would complete and the error status would be updated.
The only property on this interface is bool HasErrors { get; }. There isn't really any guidance on how to implement this property. We've got two basic options, neither of which works.
- Block until the asynchronous validation completes, which hangs the UI.
- Return immediately. This makes the call non-deterministic, as you don't know whether or not there is a pending asynchronous validation request.
For general display purposes, you can work around this by updating the HasErrors property whenever the EventHandler<DataErrorsChangedEventArgs> ErrorsChanged; event is raised. However, if you are trying to synchronously check the validation status in response to a "Save" button, that won't be an option.
Furthermore, ErrorsChanged could theoretically be fired twice for each property change: once immediately and once when any asynchronous validation is complete. This could have weird UI effects as HasErrors toggles between two states.
Finally, there is the IEnumerable GetErrors(string propertyName); method. This method is used to validate properties. However, you can pass a null or empty string to it in order to get the object-level validation errors.
The fact that it returns an IEnumerable instead of an IEnumerable<ValidationResult> makes it look like a C# 1 interface rather than something post-dating generics.
But the lack of type safety isn’t the only problem. Consider this passage from the documentation:
This method returns an IEnumerable that can change as asynchronous validation rules finish processing. This enables the binding engine to automatically update the user interface validation feedback when errors are added, removed, or modified.
If this method returned an IObservable, maybe that would work. But the only way for an IEnumerable to work in this scenario is if it blocks while waiting for the asynchronous validation to complete. And again, that will cause the UI to hang.
Then there is the whole encapsulation issue. As we discussed earlier, data models are supposed to be completely free of any external dependencies. Changing a property should not directly invoke a service call, as it makes the class very hard to test. If you need to verify something asynchronously, do it in a controller, presenter, or view-model.
INotifyDataErrorInfo as It Should Be Used
Despite its flaws, INotifyDataErrorInfo is used in enough UI frameworks that you can't ignore it. Fortunately we can redefine its contract without breaking compatibility.
The HasErrors property is updated synchronously when a property is changed. If the class implements INotifyPropertyChanged and the value changes, the PropertyChanged event is raised.
The ErrorsChanged event should be raised whenever a specific property becomes valid or invalid. If the object-level validation has changed, ErrorsChanged should be raised with a null or empty string for the property name.
Under the new model, GetErrors should always return a collection class that supports IEnumerable<ValidationResult>. The ValidationResult class provides useful information such as which properties are part of the validation warning. This comes into play for error messages such as "At least one of First Name/Last Name is required".
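Put together, a hand-rolled implementation of this redefined contract might look like the following sketch (the Customer class and its single validation rule are illustrative; a production version would only raise ErrorsChanged when the error status actually changes):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.ComponentModel;
using System.ComponentModel.DataAnnotations;

public class Customer : INotifyDataErrorInfo
{
    readonly Dictionary<string, List<ValidationResult>> m_Errors
        = new Dictionary<string, List<ValidationResult>>();
    string m_Name;

    public event EventHandler<DataErrorsChangedEventArgs> ErrorsChanged;

    // Updated synchronously, so it is safe to check from a Save button.
    public bool HasErrors => m_Errors.Count > 0;

    public string Name
    {
        get { return m_Name; }
        set
        {
            m_Name = value;
            ValidateName(); // runs before the setter returns
        }
    }

    void ValidateName()
    {
        if (string.IsNullOrWhiteSpace(m_Name))
            m_Errors[nameof(Name)] = new List<ValidationResult>
                { new ValidationResult("Name is required", new[] { nameof(Name) }) };
        else
            m_Errors.Remove(nameof(Name));

        ErrorsChanged?.Invoke(this, new DataErrorsChangedEventArgs(nameof(Name)));
    }

    // Always returns a collection of ValidationResult, per the redefined contract.
    // Null or empty means object-level errors (none in this sketch).
    public IEnumerable GetErrors(string propertyName)
    {
        if (m_Errors.TryGetValue(propertyName ?? "", out var list))
            return list;
        return Array.Empty<ValidationResult>();
    }
}
```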
Attribute Based Validation
Though not always suitable, you can accomplish a lot using attribute based validation. This works by putting a subclass of ValidationAttribute on individual properties. Here are some examples:
- CreditCardAttribute
- EmailAddressAttribute
- EnumDataTypeAttribute
- FileExtensionsAttribute
- PhoneAttribute
- UrlAttribute
- MaxLengthAttribute
- MinLengthAttribute
- RangeAttribute
- RegularExpressionAttribute
- RequiredAttribute
- StringLengthAttribute
To create your own validation attribute class, you simply need to override the IsValid method. Normally this is used for single-property validation, but you can gain access to the other properties on the object via the ValidationContext.
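For example, a hypothetical cross-property rule might look like this (NotBeforeAttribute and Appointment are invented for illustration):

```csharp
using System;
using System.ComponentModel.DataAnnotations;

[AttributeUsage(AttributeTargets.Property)]
public class NotBeforeAttribute : ValidationAttribute
{
    readonly string m_OtherProperty;

    public NotBeforeAttribute(string otherProperty)
    {
        m_OtherProperty = otherProperty;
    }

    protected override ValidationResult IsValid(object value, ValidationContext validationContext)
    {
        // ValidationContext gives us access to the rest of the object.
        var other = validationContext.ObjectType.GetProperty(m_OtherProperty)
                        .GetValue(validationContext.ObjectInstance);

        if (value is DateTime end && other is DateTime start && end < start)
            return new ValidationResult(
                $"{validationContext.MemberName} cannot be before {m_OtherProperty}");

        return ValidationResult.Success;
    }
}

public class Appointment
{
    public DateTime Start { get; set; }

    [NotBefore(nameof(Start))]
    public DateTime End { get; set; }
}
```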
One advantage of attribute based validation is that some frameworks such as ASP.NET MVC/WebAPI honor it without the need to implement any validation interfaces. And since it is declarative, you can even share validation logic with the UI.
Mixing Imperative and Attribute Based Validation
While in theory you can do everything using validation attributes, sometimes it is easier to do validation in a strictly imperative fashion using normal code. Reasons you would do this include:
- The validation rule concerns multiple properties
- The validation rule concerns child objects
- The validation rule won't be reused for other classes or properties
One downside of imperative validation is it is server-side only. There is no way to automatically share validation logic with the UI like you can do with attribute based validation.
Another limitation is imperative validation requires the use of a shared interface so the rest of the application has a consistent way of triggering said validation.
The Blank Form Problem
The blank form problem occurs when the user is creating a new record and hasn't filled out all of the required fields yet. When the form is first displayed, you don't want to see every field highlighted in red.
To solve this problem cleanly, the model needs two additional methods:
- Validate: Perform validation across all fields, triggering rules such as "required".
- Clear Errors: Removes all triggered validation errors from the object
Under this model, the model object starts in the clear state. If it was partially populated before being displayed to the user, Clear Errors should be invoked before it is shown to the user.
As each field is touched by the user, only that field will be validated. Then as part of the Save routine, Validate can be called to force a full check of the model, including properties not touched by the user.
Theoretical Validation Interface
Here is what I think the validation interface for .NET should have looked like.
public interface IValidatable
{
/// This forces the object to be completely revalidated.
bool Validate();
/// Clears the error collections and the HasErrors property
void ClearErrors();
/// Returns True if there are any errors.
bool HasErrors { get; }
/// Returns a collection of object-level errors.
ReadOnlyCollection<ValidationResult> GetErrors();
/// Returns a collection of property-level errors.
ReadOnlyCollection<ValidationResult> GetErrors(string propertyName);
/// Returns a collection of all errors (object and property level).
ReadOnlyCollection<ValidationResult> GetAllErrors();
/// Raised when the errors collection has changed.
event EventHandler<DataErrorsChangedEventArgs> ErrorsChanged;
}
You can see an implementation of this interface in the Tortuga Anchor library.
IValidatableObject
I would be remiss if I didn't briefly discuss the IValidatableObject interface. This interface only has a single method, defined as IEnumerable<ValidationResult> Validate(ValidationContext validationContext);.
There are some things I like about this method. Because it can trigger a full validation of the object, it solves the blank form problem. And it returns ValidationResult objects, which I greatly prefer over raw strings.
The downside is that it accepts a ValidationContext object. This is a class that practically nobody knows how to use. Here are the properties on ValidationContext:
- DisplayName: Gets or sets the name of the member to validate.
- Items: Gets the dictionary of key/value pairs associated with this context.
- MemberName: Gets or sets the name of the member to validate.
- ObjectInstance: Gets the object to validate.
- ObjectType: Gets the type of the object to validate.
- ServiceContainer: Gets the validation services container.
There isn't any real guidance on how to use these properties. For example, when, if ever, should you set the MemberName property? What does the DisplayName property actually do? What should be stored in the Items dictionary, and when is it accessible during validation?
The documentation goes on to say it "enables custom validation to be added through any service that implements the IServiceProvider interface". However, it doesn't say what types the IServiceProvider.GetService(Type) method needs to support, so this capability can't be leveraged.
All in all, the ValidationContext class wants to do everything, but due to a combination of bad API design and virtually no documentation, it succeeds at nothing. And since no UI framework utilizes this interface, there is no reason to support it or the IValidatableObject interface.
Property Change Notifications
While useful in many situations, property change notifications are most commonly associated with the MVVM design pattern. Exposed via the INotifyPropertyChanged interface, these notifications allow models to notify any associated UI elements that the underlying data has been modified. This is what allows you to do interesting things like update a model in a background process or share one model across multiple views.
The laziest way to implement property change notifications is to simply raise them every time a property setter is invoked. While it technically works, there are some performance ramifications.
public string Name
{
get { return m_Name; }
set
{
m_Name = value;
PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Name)));
}
}
In the above example, every property change notification allocates a new object just to hold the property name, even if nothing is listening for it. If these notifications occur frequently enough, this could trigger otherwise unnecessary garbage collection cycles. To avoid this, the PropertyChangedEventArgs object should be cached.
The other issue is the event is not necessarily needed. If the value wasn’t actually changed, you could be triggering a screen redraw for no reason. So a simple check is in order, giving us:
static readonly PropertyChangedEventArgs NameProperty = new PropertyChangedEventArgs(nameof(Name));
public string Name
{
get { return m_Name; }
set
{
if (m_Name == value)
return;
m_Name = value;
PropertyChanged?.Invoke(this, NameProperty);
}
}
Writing this can be quite tedious, so "MVVM frameworks" have been created to reduce the noise. Get and Set methods are used in conjunction with an internal dictionary to maintain state. In this way, the PropertyChangedEventArgs caching and value-changed check are handled for you. Specifics vary, but they all more or less look like this example from Tortuga Anchor.
public string Name
{
get => Get<string>();
set => Set(value);
}
Note there is a performance cost for this convenience. Accessing the internal dictionary is slower than using fields and boxing of values may eliminate the gains from caching PropertyChangedEventArgs.
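For reference, a stripped-down sketch of the dictionary-based pattern might look like this (this is not Tortuga Anchor's actual implementation; the PropertyBagBase name is invented, and a real framework would also cache the PropertyChangedEventArgs instances):

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

public abstract class PropertyBagBase : INotifyPropertyChanged
{
    readonly Dictionary<string, object> m_Values = new Dictionary<string, object>();

    public event PropertyChangedEventHandler PropertyChanged;

    // CallerMemberName fills in the property name automatically.
    protected T Get<T>([CallerMemberName] string propertyName = null)
        => m_Values.TryGetValue(propertyName, out var value) ? (T)value : default(T);

    protected void Set<T>(T value, [CallerMemberName] string propertyName = null)
    {
        // Skip the event entirely if nothing actually changed.
        if (m_Values.TryGetValue(propertyName, out var old) && Equals(old, value))
            return;

        m_Values[propertyName] = value;
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
    }
}

public class Person : PropertyBagBase
{
    public string Name
    {
        get => Get<string>();
        set => Set(value);
    }
}
```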
If you are only writing server-side code, you may be thinking "I don't have a UI so I don't need these". If so, you are probably correct. But there are times where using the INotifyPropertyChanged interface allows for simplifying some otherwise complex code. I would recommend server-side developers at least consider it as an option.
INotifyPropertyChanging
This twin of INotifyPropertyChanged is fired just before a value is changed. Its purpose is to allow the consumer to cache the previous value. ORMs such as LINQ to SQL and Entity Framework may use this information for some change tracking strategies.
ISupportInitialize / ISupportInitializeNotification
The purpose of ISupportInitialize is to temporarily disable property/collection change notifications, error validation, and the like. To use it, call BeginInit before making a batch of property changes.
When you call EndInit, a single "everything changed" property change notification can be sent. This is done using a PropertyChangedEventArgs object with an empty or null property name.
If you wish to be notified when initialization is complete, the ISupportInitializeNotification interface adds an Initialized event and an IsInitialized property.
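A minimal sketch of this pattern (the BatchModel class is illustrative):

```csharp
using System.ComponentModel;

public class BatchModel : INotifyPropertyChanged, ISupportInitialize
{
    bool m_Initializing;
    string m_FirstName;
    string m_LastName;

    public event PropertyChangedEventHandler PropertyChanged;

    public void BeginInit() => m_Initializing = true;

    public void EndInit()
    {
        m_Initializing = false;
        // An empty (or null) property name means "everything changed".
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(""));
    }

    void OnPropertyChanged(string propertyName)
    {
        if (!m_Initializing) // notifications are suppressed during initialization
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
    }

    public string FirstName
    {
        get { return m_FirstName; }
        set { m_FirstName = value; OnPropertyChanged(nameof(FirstName)); }
    }

    public string LastName
    {
        get { return m_LastName; }
        set { m_LastName = value; OnPropertyChanged(nameof(LastName)); }
    }
}
```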
Collection Change Notifications
Just as we need to know about changes to individual properties, we need to know when whole collections have changed. This is solved by the INotifyCollectionChanged interface.
Unfortunately, INotifyCollectionChanged is far less powerful than the interface implies. In theory, a single CollectionChanged event can tell you when whole sets of objects have been added to or removed from the collection. In practice, this isn't done because of a design flaw in WPF.
The most well-known implementation of INotifyCollectionChanged is ObservableCollection<T>. This class was designed to fire a separate CollectionChanged event for each item added or removed. When WPF was created, it assumed you would always use ObservableCollection<T>, and thus WPF doesn't support situations where NotifyCollectionChangedEventArgs.NewItems has more than one item.
As a result of this mistake, no one can implement INotifyCollectionChanged with batch update support unless they are 100% sure the collection class won't be used in a WPF setting.
With this in mind, my recommendation is to not bother trying to create a custom collection class from scratch. Just use ObservableCollection<T> or ReadOnlyObservableCollection<T> as a base class and layer any additional functionality you need on top of it.
A Type Safe Collection Changed Event
Besides having functionality that isn't used, another problem with the INotifyCollectionChanged interface is that it isn't type-safe. If you are in a context where types matter, you have to either perform a (theoretically) unsafe cast or write code to handle a situation that should never occur. To deal with this, I propose also implementing this interface:
/// <summary>
/// This is a type-safe version of INotifyCollectionChanged
/// </summary>
/// <typeparam name="T"></typeparam>
public interface INotifyCollectionChanged<T>
{
/// <summary>
/// This type safe event fires after an item is added to the collection no matter how it is added.
/// </summary>
/// <remarks>Triggered by InsertItem and SetItem</remarks>
event EventHandler<ItemEventArgs<T>> ItemAdded;
/// <summary>
/// This type safe event fires after an item is removed from the collection no matter how it is removed.
/// </summary>
/// <remarks>Triggered by SetItem, RemoveItem, and ClearItems</remarks>
event EventHandler<ItemEventArgs<T>> ItemRemoved;
}
This not only solves the type safety issue, it also eliminates the need to check the length of NotifyCollectionChangedEventArgs.NewItems.
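Layered on top of ObservableCollection<T>, the proposed events could be implemented along these lines (ItemEventArgs<T> is a hypothetical helper class; the base class still raises the usual one-item-at-a-time CollectionChanged events for WPF's benefit):

```csharp
using System;
using System.Collections.ObjectModel;

public class ItemEventArgs<T> : EventArgs
{
    public ItemEventArgs(T item) { Item = item; }
    public T Item { get; }
}

public class TypeSafeCollection<T> : ObservableCollection<T>
{
    public event EventHandler<ItemEventArgs<T>> ItemAdded;
    public event EventHandler<ItemEventArgs<T>> ItemRemoved;

    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        ItemAdded?.Invoke(this, new ItemEventArgs<T>(item));
    }

    protected override void SetItem(int index, T item)
    {
        var oldItem = this[index];
        base.SetItem(index, item);
        ItemRemoved?.Invoke(this, new ItemEventArgs<T>(oldItem));
        ItemAdded?.Invoke(this, new ItemEventArgs<T>(item));
    }

    protected override void RemoveItem(int index)
    {
        var oldItem = this[index];
        base.RemoveItem(index);
        ItemRemoved?.Invoke(this, new ItemEventArgs<T>(oldItem));
    }

    protected override void ClearItems()
    {
        // Capture the items before clearing so ItemRemoved fires for each one.
        var oldItems = new T[Count];
        CopyTo(oldItems, 0);
        base.ClearItems();
        foreach (var item in oldItems)
            ItemRemoved?.Invoke(this, new ItemEventArgs<T>(item));
    }
}
```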
Property Change Notifications in Collections
Another of the "missing interfaces" in .NET is the ability to detect when a property of an item in a collection has changed. Let's say, for example, you have an OrderCollection class and you need to display a TotalPrice property on the screen. In order for this property to remain accurate, you need to know when the price on any individual item has changed.
For my own collections, I often expose an INotifyItemPropertyChanged interface. This relays any PropertyChanged events from objects in the collection to a single ItemPropertyChanged event.
For this to work, the collection needs to attach and remove event handlers when objects are added to/removed from the collection.
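A sketch of such a relaying collection (the class name is invented; a production version would also override SetItem and ClearItems so handlers are detached there as well):

```csharp
using System.Collections.ObjectModel;
using System.ComponentModel;

public class PropertyChangedRelayCollection<T> : ObservableCollection<T>
    where T : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler ItemPropertyChanged;

    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        item.PropertyChanged += OnItemPropertyChanged; // attach
    }

    protected override void RemoveItem(int index)
    {
        this[index].PropertyChanged -= OnItemPropertyChanged; // detach
        base.RemoveItem(index);
    }

    // The sender is the item, not the collection, so listeners can tell
    // which object actually changed.
    void OnItemPropertyChanged(object sender, PropertyChangedEventArgs e)
        => ItemPropertyChanged?.Invoke(sender, e);
}
```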
Change Tracking and Undo
Though not used as often as it should, .NET has interfaces specifically for tracking changes in objects and even providing undo capabilities.
Change Tracking
On the surface, the IChangeTracking interface seems like it is easy to understand: either the object has changes or it doesn’t. But it is actually a little more subtle.
From a UI perspective, what the user usually wants to know is “Has this object, or any of its child objects, been altered?”.
From a data storage perspective, you want to know the answer to that question for the object itself, excluding any child objects which will be saved separately.
The documentation is no help here, as it doesn't define whether or not a child object is considered to be part of the "object's content". My personal preference is to say IsChanged is inclusive of child objects and to add a separate IsChangedLocal property for data storage.
Revertible Change Tracking
Building on this is IRevertibleChangeTracking. This adds a RejectChanges method to undo any pending changes. The same question about whether this applies to just the local object or also to child objects arises here.
Again, I generally assume RejectChanges walks down the object graph, rejecting all pending changes. This can be a bit tricky when collection properties are involved, but it is better to have this encapsulated in the class than to try to build an ad hoc solution.
Editable Object
Unlike IChangeTracking, IEditableObject is exclusively used for UI scenarios, specifically dialogs and data grids that offer Ok/Cancel semantics.
Before displaying the dialog or switching the data grid to edit mode, BeginEdit must be called to take a snapshot of the object. EndEdit clears the snapshot, while CancelEdit reverts the object back to that state. Note that most data grids invoke these calls automatically for you.
If you have both IEditableObject and IRevertibleChangeTracking, I recommend implementing this as a two-level undo, with IEditableObject being the second level. In other words, calling RejectChanges implicitly calls CancelEdit, but not vice-versa.
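A bare-bones IEditableObject implementation might look like this (a single property for brevity; a real model would snapshot all of its properties):

```csharp
using System.ComponentModel;

public class EditablePerson : IEditableObject
{
    string m_Name;
    string m_Snapshot;
    bool m_Editing;

    public string Name
    {
        get { return m_Name; }
        set { m_Name = value; }
    }

    public void BeginEdit()
    {
        if (m_Editing)
            return; // nested BeginEdit calls are ignored
        m_Snapshot = m_Name;
        m_Editing = true;
    }

    public void CancelEdit()
    {
        if (!m_Editing)
            return;
        m_Name = m_Snapshot; // revert to the snapshot
        m_Editing = false;
    }

    public void EndEdit()
    {
        m_Snapshot = null; // discard the snapshot
        m_Editing = false;
    }
}
```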
The Missing Property Changed Interface
Continuing our theme of "missing interfaces", there is a glaring opportunity for ORM integration. We can use IChangeTracking to tell the ORM whether or not a given record needs to be saved. But there is no interface telling us which properties have been changed. This means ORMs need to either track changed fields separately or just assume everything changed and resend the whole object to the database.
Equality, Hash Codes, and IEquatable
This is one feature set I would recommend avoiding. By our definition, data models are mutable. If they weren’t, none of the previous interfaces would make any sense.
The problem is you cannot safely implement GetHashCode and Equals using mutable properties. Dictionaries assume hash codes never change, so if that object ever finds itself acting as a dictionary key then you’ll break the dictionary.
Furthermore, what does equality actually mean for a data model? They represent the same row (i.e. primary key) in a database table? Or every property is the same between the two objects? No matter how you answer this question, someone else on your team is bound to answer differently.
If you feel you must have a non-default equality or hash code implementation, consider creating an IEqualityComparer<T>. This lives outside of the data model, so it is much easier for others to understand that you are doing something non-standard.
Likewise, you may wish to provide one or more Comparer<T> classes for specialized sorting.
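For example, a comparer that defines equality as "represents the same row" (the Customer class and its CustomerKey property are illustrative):

```csharp
using System.Collections.Generic;

public class Customer
{
    public int CustomerKey { get; set; }
    public string Name { get; set; }
}

// Lives outside the model, making the non-standard equality choice explicit.
public class CustomerKeyComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
        => x?.CustomerKey == y?.CustomerKey;

    public int GetHashCode(Customer obj)
        => obj.CustomerKey.GetHashCode();
}
```

Pass it in where needed, e.g. new HashSet<Customer>(new CustomerKeyComparer()).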
ICloneable
It is well known that you shouldn't implement the ICloneable interface, because it was never specified whether the clone is supposed to be shallow or deep.
That doesn't mean you should never offer a Clone method. Rather, if you choose to offer one, you should be very explicit about what is being cloned. Maybe even call it ShallowClone or DeepClone as appropriate.
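For example (the Invoice class is illustrative):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Invoice
{
    public string Customer { get; set; }
    public List<string> Lines { get; set; } = new List<string>();

    // The copy shares the Lines list with the original.
    public Invoice ShallowClone()
        => new Invoice { Customer = Customer, Lines = Lines };

    // The copy gets its own Lines list.
    public Invoice DeepClone()
        => new Invoice { Customer = Customer, Lines = Lines.ToList() };
}
```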
Concluding Thoughts
Models are the foundation upon which the rest of the application is built and understood. The time you spend preventing cracks such as inconsistent naming conventions, missing features, and incorrectly implemented interfaces will be repaid many times over.
About the Author
Jonathan Allen got his start working on MIS projects for a health clinic in the late 90's, bringing them up from Access and Excel to an enterprise solution by degrees. After spending five years writing automated trading systems for the financial sector, he became a consultant on a variety of projects including the UI for a robotic warehouse, the middle tier for cancer research software, and the big data needs of a major real estate insurance company. In his free time he enjoys studying and writing about martial arts from the 16th century.