The Weak Event Pattern is Dangerous

Summary

This article describes the motivation for the “weak event” pattern and explains some dangers associated with using it. Briefly: a weak event manager cannot differentiate between subscribers that are active and “in use” and subscribers that are out of scope and garbage collectable but not yet garbage collected. Such subscribers will have their event handler (or message handler) invoked even after they become garbage collectable, and that can cause unexpected side effects.

Source code demonstrating this issue can be found at https://github.com/ladimolnar/Samples/tree/master/Sources/WeakEvents

The birth and death of a .NET object

The life cycle of an object in .NET has quite a few complicated details. Fortunately, in most situations, these details can be ignored and a much simpler model can be assumed. There are some cases, however, when the complexity of what goes on under the hood in .NET matters. In those cases, not knowing the details can surprise you in ways that will negatively impact your application. We are going to look at one of those cases, and it has to do with zombies.
When I think about the life cycle of a .NET object (a reference type), this is the simplified model that I normally have in mind:

  • The object is instantiated using the new keyword.
  • The object goes through a period when it is in use. That means the object can be accessed from somewhere in the code. Even if you don’t do anything with it, as far as .NET is concerned, if the object can be accessed by your code then it is in use.
  • The object is no longer in use. That means the object can no longer be accessed from any place in the code. Think of a local variable that goes out of scope or a class instance after there are no longer any references pointing to it.

Zombies

If you want to go beyond this simple model, the best way to learn more is to learn about the garbage collector in .NET. Any good resource on that subject will teach you what you need to know about the life cycle of an object. There is one particular aspect of that life cycle that we’ll look at. There is a time interval after an object is no longer accessible from anywhere in the code and before the memory for that object is reclaimed by the framework. Reclaiming the memory for that object is done by the Garbage Collector. In other words, there is a time interval after the object becomes garbage collectable and before the object is actually garbage collected. In that time interval the object still exists but it is not quite “alive”. An object in this state is sometimes referred to as a zombie object.

It is common that an object exists as a zombie for only a short period of time. That could be just until the Garbage Collector gets a chance to run next time. However, there is no set rule for how often the Garbage Collector runs. Basically it runs “when needed” and what that means is influenced by numerous factors. Also, just because the Garbage Collector runs, it does not mean that it will collect all garbage collectable (zombie) objects. The garbage collector makes a compromise between the need to keep the application memory set small and the need to run the garbage collection process as fast as possible. Some zombies will survive multiple garbage collections. In some cases, a zombie can last for a very long time – even for the remaining life of the application.

Resurrecting zombies

Why should you care about zombies? You shouldn’t of course. After all, you can no longer access the zombie object and as far as you are concerned the object is “gone”. That is, unless you are in possession of some magic that you can use to resurrect it. Then you should care.
That magic exists in the form of “weak references”. Sometimes that magic will be invoked on your behalf without you even realizing it and that is when complications can follow.
A weak reference in .NET is implemented by a class called WeakReference. A WeakReference can be used to reference another object in a way that will not prevent the garbage collector from reclaiming that object. If the referenced object becomes a zombie, the weak reference can resurrect it. Resurrecting a zombie means creating a regular (strong) reference to it. Since now there is a new reference to that object, as far as the Garbage Collector is concerned, the object is “in use” and no longer garbage collectable. Of course, after being resurrected, if all references to that object “go away” again, then the object will become a zombie again.
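To make the mechanics concrete, here is a minimal sketch of obtaining a strong reference back from a WeakReference. The class and variable names are made up for illustration. The sketch deliberately keeps the object strongly referenced so that the outcome is deterministic; once the last strong reference is gone, the same call keeps succeeding only until the Garbage Collector actually collects the object.

```csharp
using System;

public static class WeakReferenceDemo
{
    public static bool Run()
    {
        var document = new object();              // ordinary (strong) reference
        var weak = new WeakReference(document);   // does not keep 'document' alive by itself

        // Ask the weak reference for a strong one.
        // Target returns null once the object has been collected.
        object recovered = weak.Target;

        // Keep 'document' strongly reachable up to this point,
        // so the lookup above cannot observe a collected object.
        GC.KeepAlive(document);

        return recovered != null && ReferenceEquals(recovered, document);
    }
}
```

Note that nothing in the WeakReference API tells you whether the object you got back was still in use or was already a zombie. You only learn whether it has been collected yet.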

Why would you ever want to use a WeakReference and resurrect zombies? Without going into a lot of detail, here are a few reasons why you may want to take advantage of the WeakReference class:

  • You can implement a caching system using weak references. The idea is that you have an object that is expensive to create, uses a lot of memory and is seldom used by the application. Once created, you can provide access to that object via a weak reference. If the application is subjected to memory pressure this system will allow the memory used to be reclaimed. If the application is not subjected to memory pressures the object can be made available later without incurring the cost of instantiating it again.
  • You can instrument your code and set up a memory leak detector based on weak references.
  • You can implement the weak event pattern using weak references.
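As a rough illustration of the first item, a cache built on a weak reference could look like the sketch below. WeakCache and its members are hypothetical names that I made up for this example, and the sketch is not thread safe.

```csharp
using System;

// Hypothetical single-object cache; names are illustrative only.
public sealed class WeakCache<T> where T : class
{
    private readonly Func<T> _factory;
    private WeakReference _weak;      // stays null until the first call to Get

    public WeakCache(Func<T> factory)
    {
        _factory = factory;
    }

    public T Get()
    {
        // Try to recover the previously created object, if it was not collected.
        T value = null;
        if (_weak != null)
        {
            value = _weak.Target as T;
        }

        if (value == null)
        {
            // Never created, or already collected: pay the creation cost again.
            value = _factory();
            _weak = new WeakReference(value);
        }

        return value;
    }
}
```

While some caller still holds the object, Get returns the same instance. After the object has been collected, Get simply builds a new one.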

Classic events and memory leaks

Let’s leave zombies aside for a moment; we’ll get back to them soon. Let’s talk about the most common cause of memory leaks in .NET: the incorrect use of classic events. Any time an object C (client) subscribes to an event provided by another object S (service), object S will obtain a reference to object C. Any time you write code like this:

service.SrvEvent += this.MyHandlerMethod;

the compiler generates code that will result in the event “SrvEvent” holding a reference to “this”. That makes sense, otherwise the event would not know how to invoke the correct event handler.
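You can observe that reference directly, because every delegate exposes the object it is bound to through its Target property. In the sketch below, SrvEvent and MyHandlerMethod follow the snippet above, while Service, Client and HoldsReferenceTo are hypothetical names added only for inspection.

```csharp
using System;

public class Service
{
    public event EventHandler SrvEvent;

    // Hypothetical helper: reports whether the event currently references 'candidate'.
    public bool HoldsReferenceTo(object candidate)
    {
        if (SrvEvent == null)
        {
            return false;             // no subscribers at all
        }

        foreach (Delegate handler in SrvEvent.GetInvocationList())
        {
            // Target is the instance the handler method will be invoked on.
            if (ReferenceEquals(handler.Target, candidate))
            {
                return true;
            }
        }

        return false;
    }
}

public class Client
{
    public void MyHandlerMethod(object sender, EventArgs e)
    {
    }
}
```

After `service.SrvEvent += client.MyHandlerMethod;` the helper returns true for `client`, and after the matching `-=` it returns false again.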

In many cases object C is an object with a short life span like a page or a control and object S is an object with a long life span like a service that lives throughout the entire duration of the application.

[Figure: ClassicEvents2]

The problem arises when the object C does not unsubscribe from the event. In that situation the event will retain its reference to object C and that will cause object C to leak in memory. Fixing this scenario is straightforward: the client code should at some appropriate moment unsubscribe from the events it subscribed to. What the “appropriate moment” is depends on the situation. For example, a control could subscribe to such events when loaded and unsubscribe from them when unloaded. There are cases however when finding the appropriate moment to unsubscribe is difficult and setting up the correct subscribing / unsubscribing mechanism can be a complex undertaking.
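The loaded / unloaded approach could be sketched as follows. PriceService, PriceControl, OnLoaded and OnUnloaded are hypothetical names; in a real UI framework you would hook whatever load and unload notifications that framework provides.

```csharp
using System;

// Long-lived object S (hypothetical).
public sealed class PriceService
{
    public event EventHandler PriceChanged;

    public void RaisePriceChanged()
    {
        EventHandler handler = PriceChanged;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

// Short-lived object C (hypothetical).
public sealed class PriceControl
{
    private readonly PriceService _service;

    public int Refreshes { get; private set; }

    public PriceControl(PriceService service)
    {
        _service = service;
    }

    // Subscribe when the control becomes active.
    public void OnLoaded()
    {
        _service.PriceChanged += OnPriceChanged;
    }

    // Unsubscribe when the control goes away, so the service drops its reference.
    public void OnUnloaded()
    {
        _service.PriceChanged -= OnPriceChanged;
    }

    private void OnPriceChanged(object sender, EventArgs e)
    {
        Refreshes++;
    }
}
```

Once OnUnloaded has run, raising the event no longer reaches the control, and the service no longer keeps it in memory.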

Weak references to the rescue

To address the case where finding the appropriate moment to unsubscribe is difficult, the “weak event” pattern was introduced. The idea behind it is to have a subscription mechanism that does not retain direct references to the objects that subscribe to events. Instead, the subscribers are referenced via a WeakReference. With a mechanism like this, a subscriber that fails to unsubscribe from events can still be garbage collected after it goes out of scope. The implementations of this pattern differ from the classic event mechanism in ways other than just using weak references. For example, the focus shifts from the concept of events to the concept of messages. Also, with classic events the client code needs some knowledge of, and access to, the service that provides the event. With messages, the client code only knows that certain messages are raised and may not know the identity of the service that publishes them.
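To show the general idea, here is a deliberately small sketch of such a mechanism. It is not how any particular library implements the pattern. WeakEventSource, Subscribe, Publish and Listener are hypothetical names, the sketch supports only instance-method handlers that take a string message, and it is not thread safe.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical minimal weak event source; illustrative only.
public sealed class WeakEventSource
{
    private sealed class Subscription
    {
        public WeakReference Target;   // weak reference to the subscriber
        public MethodInfo Method;      // handler method to call on it
    }

    private readonly List<Subscription> _subscriptions = new List<Subscription>();

    public void Subscribe(Action<string> handler)
    {
        if (handler.Target == null)
        {
            throw new ArgumentException("This sketch supports instance handlers only.", "handler");
        }

        // Store the subscriber weakly; the delegate itself is not retained.
        _subscriptions.Add(new Subscription
        {
            Target = new WeakReference(handler.Target),
            Method = handler.Method
        });
    }

    // Returns the number of handlers that were invoked.
    public int Publish(string message)
    {
        int invoked = 0;
        for (int i = _subscriptions.Count - 1; i >= 0; i--)
        {
            object target = _subscriptions[i].Target.Target;   // strong reference or null
            if (target == null)
            {
                _subscriptions.RemoveAt(i);                    // subscriber was collected
                continue;
            }

            _subscriptions[i].Method.Invoke(target, new object[] { message });
            invoked++;
        }

        return invoked;
    }
}

public sealed class Listener
{
    public string Last;

    public void OnMessage(string message)
    {
        Last = message;
    }
}
```

Notice that Publish can only ask whether the subscriber has been collected yet. It has no way of asking whether anything else in the application still uses it.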

The dangers of the Weak Event pattern

The problem with a system that implements the “weak event” pattern is that, when using weak references, it cannot differentiate between a subscriber that is active (in use) and a subscriber that is garbage collectable but not yet collected, in other words a zombie. If the subscriber is a zombie, when the event is raised (or the message published) a weak event manager will inadvertently resurrect the zombie, call its event handler and then let the subscriber become a zombie again. Having the event handler of a zombie executed can lead to problems.

Let’s consider as an example a UI control that subscribes to an event using a weak event manager. At some point that control will go out of scope. It will disappear from the screen and no place in the code will still have a direct reference to it. Regardless of that, until the control instance is actually garbage collected, its event handler (or message handler) will still be called. At some point during the execution of your application you could have 10 instances of a control: one instance that is actually shown on the screen and 9 instances that are zombies. When the event they subscribed to is raised, one event handler will execute as expected. Nine other event handlers will also execute, and it is fair to say they will do so without the developer intending it.

Imagine the event handler performing a database operation, maybe charging a fee, or updating certain data and thereby overwriting another update that was just performed by the “legitimate” event handler. Or imagine the event handler taking other actions that are no longer valid because the context associated with such a zombie control is out of date. It is easy to see how this type of scenario can cause crashes, data corruption or other inappropriate actions like overcharging a client.
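The ten-control scenario can be approximated with the following sketch. It is separate from the demo application linked in this article, and FeeControl, ZombieDemo and their members are hypothetical names. A static counter stands in for the database side effect. The exact numbers depend on when the Garbage Collector runs, so treat the values mentioned in the comments as typical rather than guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public sealed class FeeControl
{
    public static int FeesCharged;     // stands in for a real side effect

    public void OnCheckout()
    {
        FeesCharged++;
    }
}

public static class ZombieDemo
{
    private static readonly List<WeakReference> Subscribers = new List<WeakReference>();

    // Creates a control, subscribes it weakly and drops the only strong reference.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndAbandonControl()
    {
        Subscribers.Add(new WeakReference(new FeeControl()));
    }

    private static void RaiseCheckout()
    {
        foreach (WeakReference weak in Subscribers)
        {
            // A zombie that was not collected yet is handed back here as well.
            FeeControl control = weak.Target as FeeControl;
            if (control != null)
            {
                control.OnCheckout();
            }
        }
    }

    public static int Run(bool forceCollection)
    {
        Subscribers.Clear();
        FeeControl.FeesCharged = 0;

        for (int i = 0; i < 9; i++)
        {
            CreateAndAbandonControl();          // nine controls nobody uses any more
        }

        var visible = new FeeControl();         // the one control still on screen
        Subscribers.Add(new WeakReference(visible));

        if (forceCollection)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
        }

        RaiseCheckout();
        GC.KeepAlive(visible);

        // Typically 10 without a forced collection and 1 with it.
        return FeeControl.FeesCharged;
    }
}
```

The code that raises the event is identical in both cases. The only thing that changes the number of fees charged is whether a garbage collection happened to run in between.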

What is worse, depending on runtime behavior and timing, a bug like this may manifest only in certain circumstances that are influenced by many factors, such as the amount of memory installed on a device, the memory used by the application, the timing of the event and so on. You can easily run into a situation where your application runs just fine for a long time and once in a while it crashes or corrupts data. Good luck debugging it!

Conclusion

Systems that implement the “weak event” pattern often advertise that they solve the memory leak issues associated with classic events. Such systems may have other legitimate advantages, but their attempt to solve the memory leak issue is a double-edged sword. While it is true that such systems will prevent memory leaks in scenarios where subscribers fail to unsubscribe, they also open the door to subtle bugs caused by zombies having their event handlers called. The cure in this case may be worse than the disease. Do not believe statements that encourage you to use the weak event pattern as something that allows you to neglect unsubscribing from events. Always find a way to unsubscribe from events at some appropriate time. Occasionally that can be difficult to implement and involves putting complex code in place, but that is better than the alternative.

This does not mean that I have a blanket recommendation against using weak event managers. As I mentioned earlier, they differ from classic events in other respects, and those differences can in some cases be beneficial. In any case, in situations where weak events have no clear advantage over classic events, it is better to use classic events. If you decide to use weak events, just make sure that you do not neglect unsubscribing from events (messages).

Source Code and Demo Application

Source code demonstrating the dangers of the weak event managers can be found at:
https://github.com/ladimolnar/Samples/tree/master/Sources/WeakEvents

As a weak event manager implementation, the source code uses the messaging plug-in for MvvmCross.
See NuGet: MvvmCross.HotTuna.Plugin.Messenger, version 3.5.1.

Here is a screenshot of the demo application:

[Image: WeakEventsDemoAppScreenshot]
