OJ’s rants What would OJ do?

10Feb/098

Always Question the Source (aka “Don’t Lock on Type Objects”)

For one reason or another, I recently found myself perusing some code based on the CSLA framework. While nosing around I came upon a snippet of code that I found rather disturbing. An example can be found here in the function called InitializeAuthorizationRules.

For those who are lazy, here is the particular snippet of code that caught my eye:

lock (this.GetType())
{
  // .. stuff ..
}

If you want to see more, head over there and read up. There are quite a few instances of the code listed above.

So why is this disturbing? If you're not sure of the answer, take a bit of time to go and read up on C#'s lock keyword. When you're done, ask yourself "what kind of object should I be using alongside the lock keyword?".

If the answer escapes you, then toodle over to this little doozy for a blow by blow account. I'll quote a rather gifted developer friend of mine (who for now shall remain anonymous) who summed up nicely one of the issues that could occur if the above code makes it into your code base:

This is terrible, terrible, terrible.

Good luck with the cross-appdomain deadlock which brings down prod and can't be diagnosed without 2 weeks behind windbg.

Soooo true.

Now that you know it's bad you might be wondering how a framework like CSLA managed to get polluted by it. Time to speculate!

You may have noticed that the article I linked above mentions that the practice of locking type objects was actually demonstrated/advocated on MSDN:

This is even done in MSDN sample, makuing it the holy grail.
.
.
Rico Mariani, performance architect for the Microsoft® .NET runtime and longtime
Microsoft developer, mentioned to Dr. GUI in an e-mail conversation recently that a
fairly common practice (and one that's, unfortunately, described in some of our
documentation right now, although we'll be fixing that) is actually quite problematic.

Note: Spelling mistakes and awful grammar in the quote above are, for once, not my fault.

I'm guessing that the author(s) of CSLA were reading up on some multithreading documentation on MSDN and came across a sample which demonstrated locking type objects as shown above. Since they were reading MSDN, the apparent Bible for all things .NET, they may have assumed that whatever they saw could be taken as Gospel.

Unfortunately, no resource is perfect. Not even MSDN.

This is where I have my gripe. The authors of any software should always critique all of the code they come across during the course of development. Whether they wrote it themselves, got it from MSDN, read it on a blog site while researching or saw it in a book written by the author of the language. NEVER EVER assume that the code you are reading is 100% sound.

If the authors had thought about the meaning of the lock statement and had an understanding of exactly what GetType() does (ie. always returns the same reference when called on the same type), then perhaps they might have figured out that using a lock on something that's accessible from any object in the process is a bad idea. It is opening the door for potential deadlocks if somebody else decides to do the same.

So I say again: do not assume that the code you get off the Internet is safe! Scrutinise it. Pull it to pieces. Understand it. Then, if all is safe and you're still comfortable with it, consider using it in your software. Don't assume that the author of the code knows what they're doing...

... unless it's me of course ;)

  • OJ
    @Tarn: You raise some really good points. I'm going to speculate on the other side of your argument (not because I don't agree with you mind you ;) but for the sake of discussion) since I don't really know the motivation behind the design decision nor the overall goals of the framework.

    On the surface, it appears that the code that is being locked is a chunk of code that should only be run once per type. This happens to be the case in other areas of the code as well where quite a variety of things are getting done. Each example looks to be doing a bit of ground work for gathering/generating dictionaries of information which will then be used down the track for other things. Obviously this work should only be done once and hence this is where the issue starts.

    By looking further up the chain at what this is intended to solve it appears that the implementation relies on function overrides followed by calls to base class implementations (so that a wholistic view of a given object is generated appropriately). Pardon me for elaborating here, I'm stating what might be obvious to you for the sake of others who may not be able (or can't be arsed :) ) to look through the CSLA code.

    On the surface it appears that there could be better ways of doing things. One could argue that even if a collection of information is required for a certain type which includes that of its base types, there isn't a need to duplicate information that is common to base types in all instances of the derived classes.

    So the first point I'd like us all to consider is that given the nature of the functionality required, are you actually writing less code using their mechanism over the one listed above (which is very much a "potential implementation" and certainly not one that I would necessarily advocate :) )?

    The second point is that I don't feel that it's even necessary to perform those locks given the nature of the functionality. Without trying to over-simplify the problem, clever use of static constructors can allow creation of appropriate information in a thread-safe manner while utilising class-specific information at runtime is entirely possible. That sounded like a lot of fluff and buzzwords, but I hope the point is clear :)

    If we consider your point regarding convention over configuration, I still think that the implementation in CSLA is lacking. There is potential for deadlocks regardless of the convention you follow. On the other hand, the above implementation forces you to provide an object for locking which is specific to your type. While this isn't foolproof, it certainly locks away the potential for deadlocking and may make it less likely to happen. Using the compiler, along with examples inside your framework, is a much better way to manage this. A well-placed comment or accompanying documentation would work wonders here. Unfortunately, nothing about this is documented in what I have seen of the CSLA code.

    The thing that scares me about locking the type is that your type might be accessible from anywhere in the code. It could be loaded as part of a plugin mechanism which on the surface thinks that locking the unknown type in the same manner is a safe thing to do. I guess what I'm alluding to here is, you don't necessarily know who or what is going to be invoking your code, especially if it's framework code. The caller has just as much right to lock your types as you do.

    I tend not to trust those calling my code, and hence attempt to reduce potential points of failure as much as possible (though I'm far from perfect!). Some people might consider that a bad thing to do, but so far it's proved to be beneficial in my experience. I'd love to hear some stories on the flip side though! Rails is a good example, though I am not aware of any part of Rails that exposes the potential to cause this much damage :) Please share if you know any!

    I don't have a problem with convention over configuration so long as the convention makes sense but doesn't open up a potential world of hurt.

    I am in the same boat as you mate, I can't be too critical either. It's rare to find anything like this done in a way that pleases any more than 50% of developers who use it. I'm certainly no angel and would have done my fair share of awful stuff (though possibly not this bad ;) ). I do, however, love pointing this stuff in the hope that fellow geeks can help me learn something!

    Right, enough babble. Thanks for your comment, it made for great reading. I also forgot to mention in my previous response that you need not worry about comment length. Feel free to say whatever you need to say.

    Cheers!
    OJ
  • @OJ Hi, and thanks for the welcome to the blog. I'm also glad to be involved in a good tech yarn, well done picking up the risk of deadlock in the code and starting it.

    I completely agree that locking a type is BAD and we can all see why. Still I'm glad the language isn't restrictive and lets you do it.

    Your implementation does resolve the deadlocking possibility, however, it may not be in line with other goals of the framework, such as do more with less code. This is obviously because you now have additional and identical wiring in each derived business object class (which we are also told is BAD, ie DRY). Its also worth considering its a breaking change for all the current users of the framework, they'd have to update all their business object classes.

    Another option I alluded to in my first comment is to have a private static dictionary of types and locking objects in the base class. This would mean the base class could lock a private object associated with the type and not the type itself. This would resolve the problem of extra code in the derived classes, the risks of dead lock and is no longer a breaking change for existing users.

    To keep this thread safe you would also need another lock for serializing inserting types/object pairs to the dictionary. The objects would only compete for this lock once for every new type, which is probably acceptable as long as they're not creating lots of new business object classes dynamically (which is possible on the DLR, but probably not supported by CSLA) .

    I think this is also a discussion about using bad programming techniques to solve a problem. Many other application frameworks like Ruby on Rails make heavy use the sometimes hard to swallow concepts of  "convention over configuration". I think this is the same thing, the application framework is saying "hey I can do all the cool stuff in the background for you as long as you follow my convention of not locking my damn framework types!"

    But I agree that if we can think of a way to do it without using a type for a lock, then ideally it should have been implemented to protect the users of the framework and set a good example to other developers looking at the code.  I can't be too critical though, unfortunatly rarely have I seen "ideally" implemented in real software development.
  • OJ
    @Tarn: I recognised your name as soon as I saw it despite you never actually posting comments before. Thanks for taking the time to read and post your thoughts. I too enjoy reading your blog posts. Congrats on your efforts at Devsta. Be sure to poke Andy in the eye for me next time you see him ;)

    @Keef: I feel the same as you and still feel the risks are high.

    @Both of you: This is the kind of discussion that I had hoped would kick off as a result of the post. Time to delve a little deeper. The post talks about why it's bad to lock on Type objects, it doesn't discuss the potential design issues in using this mechanism in the manner in which it's used inside CSLA.

    First of all, I stand by my point: locking Type objects is BAD. It doesn't matter if other developers do not use Type objects themselves when they're dealing with the framework, the point is that there is potential for this to happen. If you're working on a framework, you end up inherently setting "standards" that other devs will follow when they use your framework. In my view, locking on type objets in this case demonstrates to other devs that it's ok, when it clearly isn't ok. The practice should be avoided.

    Tarn, you mentioned that it would only work if GetType() returned the same reference when it's called. It will return the same reference when called on the same type. That is:
    MyType a = new MyType();
    MyType b = new MyType();
    bool check1 = ReferenceEquals(a.GetType(), b.GetType());
    bool check2 = ReferenceEquals(a.GetType(), typeof(MyType));

    In the code above, both check1 and check2 are true. Hence, we're locking sections of code using a key that is accessible across the entire process. This can't ever be a good thing. It provides the potential for deadlocks, which, espcially in a framework, is a really bad thing.

    Personally I feel this is a design smell/issue. There are definitely other ways to solve this problem that do not open the door to (as much?) abuse. Tarn, I agree with what you're saying with regards having statics in base objects and how that won't solve the problem. I still think there are other alternatives. Here's one that is up for discussion :)

    The base object that performs this functionality provides an abstract property/method that returns a reference to a lock object specific to the type. eg:
        public abstract class Base
    {
    protected abstract object TypeLock { get; }

    public Base()
    {
    lock (TypeLock)
    {
    // do important stuff
    }
    }
    }

    In each of the derived classes, they then have a responsibility of providing a lock that is specific to this type in this context. Sorry for the poor variable/property naming by the way ;) It should illustrate the point though.
        public class Derived : Base
    {
    private static object _typeLock;

    protected override object TypeLock
    {
    get { return _typeLock; }
    }

    static Derived()
    {
    // thread-safe, single execution
    _typeLock = new object();
    }

    public Derived()
    {
    // other funky code.
    }
    }

    So this code performs essentially the same thing, but there is no risk of deadlock across other areas of the process because the "key" is per-instance and only accessible within the inheritence hierarchy.

    Now, does this completely remove the possibility of deadlocks? In this case, perhaps not (currently trying to think of a scenario where it might happen). It's definintely better than using a key which anyone can use.

    As we are all aware (and as pointed out by Tarn in a previous comment) you'd need to do something like the following to cause a deadlock in the current CSLA implementation:
    lock(typeof(Derived)) { Derived d = new Derived(); }

    If we used the second implementation we simply wouldn't be able to do this at all. The internal lock uses a protected object as the key, so it just wouldn't be possible.

    So what flaws are there in this implementation? What would be better? Do you think this approach is better or worse than locking on types? I'm keen to see what approach other people would take.

    Thanks for the great discussion guys :) And Tarn, welcome to the blog. It's great having some more smart dudes keen for a tech yarn.

    Cheers!
    OJ
  • Hi Keef, 

    The scenario you describe wouldn't fail or deadlock. The second thread would just block until the first thread released the lock, then continue on its way. 

    If you locked one of these frameworks business object types and then tried to construct an object of that type *inside* the lock, it would deadlock. This is because when the object is constructed it would try and open the lock for that type and would block until the lock is released. The deadlock occurs because the lock can't be released until the object is created and the object can't be created as it needs the lock to be released.   

    There is no doubt other more fancy ways to cause deadlocks, but it won't happen until some user of the framework starts trying to lock a business object type. While its possible to do, its not very likely anyone would have any reason to do it. Especially given the risks of locking types or public properties is documented in the lock MSDN article.

    I'm not saying I think this its a good architecture, just that I think the developers did know what they were doing and the risk is actually very low in reality.
  • Hey tarn.  You write...

    Who’s going to try and lock one of your frameworks business object types and then construct an object of that type inside the lock?


    I don't know the framework in question, but reading this screams deadlock to me.  It's fine in a single threaded application, but imagine if you lock the type in one thread of your application, then another thread (either on another processor core or because the first thread has been pre-empted out by the kernel) tries to create an object of that type during the lock.  It'll fail.
  • I looked at that code file you linked to and it struck me that maybe the author did know what he was doing.

    The class is the base class for business objects which have some kind of AuthorizationRules dependency. On construction of each of these business objects a static call is made to check if the AuthorizationRules for this type exist. If they don’t exist, another static call is made to create them for this type. The lock is there so multiple business objects don’t try to create the AuthorizationRules for the same type.

    Firstly, the lock wouldn’t work if GetType() didn’t always return the same instance. Secondly, using a private static object for the lock (which is the recommendation in the mail archive) doesn’t easily work in this framework. The class is a base class that different types of business objects derive from. If a private static object was used as a lock in this class, then the different types of business objects would all have to wait for the same lock. I don’t think there’s any other simple ways of having this per type locking code wrapped up in this base class (a static dictionary of types and locks might work, but this isn’t my point).

    I’ve only seen one code file and don’t even know the product, but I would I think the developer probably knew and understood what he was doing. I don’t think its good code or architecture but I don’t think it matters that much. Who’s going to try and lock one of your frameworks business object types and then construct an object of that type inside the lock? That’s what you’d have to do to cause a deadlock, and I think you’d find the problem pretty quickly because that section of code could never complete execution.

    I do enjoy reading your blog and I’m sorry for the long comment, hope I’m not breaking any blog comment length conventions.
  • OJ
    @lb: Thanks mate. I'd like to hear Rocky's thoughts as well. Not sure how to get his attention though :)

    I hadn't had any experience with CSLA until my current position. We're using a framework based on CSLA. The thing is, in theory, CSLA sounds fantastic. The accompanying documentation implies that it's trying to solve the world's coding problems -- enterprise app dev should be a breeze, code is minimal, maintainable, etc. The problem for me is that I don't believe one single framework can be made that will do this.

    Some of the implementation details have scared me to death. Custom security model that smells like a half-baked attempt at CAS and wierd "DataPortal" function calling mechanisms that have reflection up the wazoo are just two of the many things that I don't personally agree with.

    Either way, I'm stuck with it. So whinging won't achieve anything :)

    Cheers for the comment. Feedback is greatly appreciated.
  • lb
    i'd love to see if Rocky Lhotka has any thoughts on this.


    CSLA was out in front in the early days of .net -- but I wonder if (like a lot of software) it's struggled to keep up and to change, and maybe in some important ways it's left quite behind now.


    I know I've heard some people offer very strong criticisms of it -- but then others are claiming a lot of success with it.


    Good work for digging deep on the technical aspects OJ.


    lb
blog comments powered by Disqus