The desperate quest for doing it 'right'
This morning I ran into an interesting design decision. The problem at hand isn't that interesting; I've solved it many times before. What is interesting is that this problem isn't always solved the same way. It goes like this: do you tell an element which is inside a container (which can itself be inside another container) to exclude (remove) itself from its container, or do you tell the container to exclude (remove) the element? This might sound simple enough, but what is the right thing to do here? And if one is chosen, on what grounds is that approach the right thing, and is that always the case, no matter what the scenario might be? No, "It depends" doesn't cut it, for the sole reason that every single day probably millions of developers around the world are, in various states of desperation, searching for the right thing to do, be it for this or for other problems. Check the various Q&A sites, the various newsgroups and, above all, the wide range of developer blogs, articles and Twitter channels, and you'll see that a lot has been, is being and will be discussed about that single concept: the right thing.
When I was confronted with the decision outlined above (more on that below), I wondered how a developer with decades of experience in the trenches, like myself, still has to think about this somewhat small decision and isn't capable of instantly choosing one option over the other. Is it the big fear deep in all of us that if we make the wrong decision it might haunt us and eventually bring us down? Looking at myself, and taking into account the massive code base this decision will be part of, it does bug me: if I make the wrong decision, it might hurt the system, my company, and everyone depending on them. If I have these kinds of questions, there must be others wondering the same thing: what's the right thing to do? Looking at all the blogs, articles and answers given to similar questions on various Q&A sites, there are indeed many, many people out there wondering that same thing, showing it either by giving advice on how to do the right thing (dear reader, if you now start to wonder whether this is a recursive blog post, you're probably right) or by asking what it might be.
So I wondered: isn't this quest to find what the right thing is actually haunting our profession and why exactly is this? Why do we care so much? And more importantly: can we ourselves solve this?
This post was also partly triggered by a blog post I read this morning by Patrick Smacchia, in which he shows that StructureMap, a toolkit written by Jeremy D. Miller, has cyclic dependencies between namespaces, and tries to make the case that this kind of coupling is not that great to have. Reading the post, I wondered why anyone would give a hoot about such a thing. Don't get me wrong, I like solidly written software which allows great maintainability and extensibility without a lot of effort, but I couldn't help wondering who decides what's right and wrong and why we should care about these kinds of 'rules'. A lot of these rules make sense simply because they are based on common sense. However, I still have the feeling that the vast majority of them only work in a given scenario, and in many cases the boundaries of that scenario are omitted, be it deliberately or by mistake. The pitfall is that when the scenario boundaries aren't given, the rule starts to look like one that can always be applied: it appears to be based on common sense and it comes with no stated limits on where it works, so surely it must work everywhere.
With the rewrite of LLBLGen Pro's designer for v3.0 using .NET 3.5, I'm trying to do some things differently compared to what I did in v2.x. One of these things is a completely different set of data-structures to store meta-data. These data-structures give a lot of freedom to reason about the meta-data and, as everything is event/observer controlled, it's all very loosely coupled. However, I too ran into cyclic dependencies between namespaces (inside a common root namespace in the same assembly). When I detected this, I wondered: "Should I correct this?" But then I thought: "Why? Will the sky fall down if I leave these few cycles in?" So I did what I always do when I have to make a design decision: make a pro/con list, decide which option is better based on that list, document the decision and why I took it (so I can always read back why a decision was made, which is the most important thing about design documentation), and move on. In this particular case, I didn't see any advantage in refactoring the code to obey a rule which is only important if you're going to split up assemblies (which I'm not planning to do here), so I left the cycles in.
You might wonder why I didn't present the pro/con list of the decision I started this post with right away. The main reason is that I wanted to show you that there is no such thing as the right thing if no context is given, or better: if your situation isn't known. This is very important. Today, and in the years to come, you'll likely be exposed to articles, blog posts, books, lectures and what not, written by generous people who simply want to help you out, which will tell you what the right thing to do is. I'd like you to keep one single thought in mind whenever you read such an article or post: does the scenario this rule, this good advice, applies to actually match my scenario? If not, be careful about applying that advice without properly thinking it through. Software engineering isn't an exact science: there's no formula where you put something in and the result is calculated. Our profession is about building an executable form of what's been described as the functionality of a system, and there are no turn-key solutions to make that possible: every situation is different.
Let's go back to the decision I started with and give it some context: a scenario, the situation it occurred in. This might help you re-examine what you instantly thought the right thing was when you read the first paragraph. In LLBLGen Pro v3, the user can add meta-data obtained from multiple databases to a single project and choose which elements to obtain from these databases, e.g. which tables, which schemas etc. See the screenshot below for an impression.
LLBLGen Pro v3 alpha: Step 2 of meta-data retrieval wizard.
The screenshot above shows how the user obtains the meta-data: a simple click-through wizard with some fancy selection/filtering (not shown). The ability to select elements also raises the question: "What if I selected a couple of tables I don't want to see anymore in my project?" Or in other words: I want to be able to exclude elements (tables in this example) from the project later on without actually removing them from the database. In the scenario presented to the user, this looks something like this:
LLBLGen Pro v3 alpha: Context menu (incomplete, not all features have been added yet!)
The above screenshot shows the ability to exclude the selected tables (Customer and Employee in this case) from the project. This means that the objects representing these two tables are removed from the meta-data data-structure of the Project (through a controller which asks the Project to exclude the elements at hand, which in turn delegates the call further) and all mappings to these elements are cleared. This feature works: it removes the tables from the Tables collection in the schema object representing the schema the tables are in. As everything is observer-aware, events are raised which are picked up by mapping objects, which clear themselves if their target is one of the excluded elements. All of this is undo-able thanks to Algorithmia.
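To make the observer-driven cleanup a bit more concrete, here is a minimal sketch in Python. It is not LLBLGen Pro's actual code (which is .NET): the names `ObservableCollection`, `Mapping`, `subscribe_removed` and the entity names are hypothetical, and the undo support provided by Algorithmia is left out entirely.

```python
class Table:
    """Stand-in for a table meta-data object."""

    def __init__(self, name):
        self.name = name


class ObservableCollection:
    """A container that notifies subscribers when an element is removed."""

    def __init__(self):
        self._items = []
        self._removed_handlers = []

    def subscribe_removed(self, handler):
        self._removed_handlers.append(handler)

    def add(self, item):
        self._items.append(item)

    def remove(self, item):
        self._items.remove(item)
        # Iterate over a copy so a handler could unsubscribe while notified.
        for handler in list(self._removed_handlers):
            handler(item)

    def __contains__(self, item):
        return item in self._items


class Mapping:
    """Hypothetical mapping of an entity onto a target table."""

    def __init__(self, entity_name, target, tables):
        self.entity_name = entity_name
        self.target = target
        tables.subscribe_removed(self._on_removed)

    def _on_removed(self, item):
        # Clear this mapping only if its own target was excluded.
        if item is self.target:
            self.target = None


tables = ObservableCollection()
customer = Table("Customer")
employee = Table("Employee")
tables.add(customer)
tables.add(employee)

customer_mapping = Mapping("CustomerEntity", customer, tables)
employee_mapping = Mapping("EmployeeEntity", employee, tables)

# Excluding Customer clears only the mapping that targets it.
tables.remove(customer)
```

The point of the sketch is that the collection knows nothing about mappings: it only raises an event, and each mapping decides for itself whether it is affected.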
When I wanted to implement the feature on the Schema node I ran into a slight problem: a Schema isn't a Schema Element like a table, view or stored procedure. It is a container, although it is itself contained in a Catalog. I had to refactor the code I had so it would also support these different elements (catalog, schema, database meta container, none of which is a schema element). This gave me the hint that the whole setup might not be correct and that I should simply create an interface like IExcludable or something, implement that on all elements which are excludable and call an Exclude method on that. Sounds logical and feng shui-compliant with Common Sense Software Engineering (CSSE), don't you think?
However, this runs into a tiny problem: does every element know its container? Is the container of a table the schema (its logical container) or the Tables collection in the schema object (its physical container)? To be able to work with meta-data, it's essential that a table knows its schema. However, it doesn't need to know that it's in a Tables collection. A schema knows its parent catalog, but a catalog doesn't know its parent container, as that's not something it should be aware of (it doesn't have a logical container, although it has a physical container in the Project). Is it better to tell the element that it should remove itself from its parent's container, or is it better to tell the parent that a contained element has to be removed? Example: do you tell the schema that it should remove itself (which in turn will force the schema to tell its logical container, the catalog, to remove the schema) or do you tell the catalog that a schema has to be removed?
The interface idea sounds great, but it requires that elements know their container (the catalog included). It gives the freedom to implement the logic inside the elements it concerns instead of in code outside the data-structures. On the other hand, code outside the data-structures, placed in a method in the Project class (which contains the meta-data) and routing each call to the right parent, is also tempting, as the Project also knows the container of the catalog, something the catalog itself doesn't know. Yes, the Catalog owns the Schemas collection, but does it own the Schema elements inside the Schemas collection?
So it comes down to:
- Interface route. This route requires little code in Project: the controller passes an IExcludable into a method on Project, the project simply calls the Exclude method on the object, and the object has to take care of being excluded (removed) from its parent. We would have to make sure the Catalog knows its container as well, but that container is outside its reach at the moment (other assembly), so logically that's not the right thing to do. Instead, for the catalog, this requires some if/else code in Project to call the container of the catalog.
- Manager code route. This route uses a switch/case statement in Project and, based on the element to exclude, calls the proper container to exclude the element. This looks straightforward, but it places the knowledge of which parent belongs to which element inside the Project class instead of inside the element itself. This could be less than ideal when, for example, the parent type changes and you have to hunt for all the references to that parent and change that code everywhere.
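The two routes can be put side by side in a small sketch. This is Python rather than the .NET the designer is written in, and everything in it is an assumption made for illustration: the `exclude` method stands in for the IExcludable interface, the method names `exclude_via_interface` and `exclude_via_manager` are made up, and the catalog and schema names are placeholders.

```python
class Catalog:
    """Knows its schemas, but not the container it lives in."""

    def __init__(self, name):
        self.name = name
        self.schemas = []


class Schema:
    def __init__(self, name, catalog):
        self.name = name
        self.catalog = catalog  # logical container
        self.tables = []
        catalog.schemas.append(self)

    def exclude(self):
        # Interface route: the element removes itself from its parent.
        self.catalog.schemas.remove(self)


class Table:
    def __init__(self, name, schema):
        self.name = name
        self.schema = schema  # logical container
        schema.tables.append(self)

    def exclude(self):
        self.schema.tables.remove(self)


class Project:
    def __init__(self):
        self.catalogs = []

    def exclude_via_interface(self, element):
        # Only the catalog needs special treatment, because it does not
        # know its own container; everything else excludes itself.
        if isinstance(element, Catalog):
            self.catalogs.remove(element)
        else:
            element.exclude()

    def exclude_via_manager(self, element):
        # Manager route: Project knows which collection holds which element.
        if isinstance(element, Table):
            element.schema.tables.remove(element)
        elif isinstance(element, Schema):
            element.catalog.schemas.remove(element)
        elif isinstance(element, Catalog):
            self.catalogs.remove(element)
        else:
            raise TypeError(f"cannot exclude {type(element).__name__}")


project = Project()
catalog = Catalog("ExampleCatalog")
project.catalogs.append(catalog)
schema = Schema("ExampleSchema", catalog)
customer = Table("Customer", schema)
employee = Table("Employee", schema)

project.exclude_via_interface(customer)  # Table removes itself
project.exclude_via_manager(employee)    # Project removes the table
tables_left = list(schema.tables)

project.exclude_via_interface(schema)    # Schema removes itself
project.exclude_via_interface(catalog)   # the if/else branch for Catalog
```

In the manager variant, every change to a parent type means touching `Project`; in the interface variant, only the catalog branch lives there.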
One thing to note is that, as you can see in the second screenshot, multiple target database types can be in the same project. Telling a project to exclude a given table object isn't simply a matter of asking a single container of catalogs to remove it: one first has to find the proper database-specific store which contains the catalog with that table. A table itself doesn't have this knowledge, of course; its parent schema's parent catalog does, or at least it knows the ID of the database.
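The lookup described above might look roughly like this. Again this is a hedged Python sketch, not the designer's real structure: the `stores` dictionary, the `database_id` attribute, `find_store` and the ID strings are all hypothetical names chosen for the example.

```python
class Catalog:
    def __init__(self, name, database_id):
        self.name = name
        self.database_id = database_id  # the catalog knows its database ID
        self.schemas = []


class Schema:
    def __init__(self, name, catalog):
        self.name = name
        self.catalog = catalog
        self.tables = []
        catalog.schemas.append(self)


class Table:
    def __init__(self, name, schema):
        self.name = name
        self.schema = schema  # a table only knows its schema
        schema.tables.append(self)


class Project:
    def __init__(self):
        # One store of catalogs per target database (hypothetical layout).
        self.stores = {}

    def add_catalog(self, catalog):
        self.stores.setdefault(catalog.database_id, []).append(catalog)

    def find_store(self, table):
        # Walk table -> schema -> catalog to reach the database ID,
        # then use it to pick the database-specific store.
        database_id = table.schema.catalog.database_id
        return self.stores[database_id]


project = Project()
first_catalog = Catalog("CatalogA", database_id="database-1")
second_catalog = Catalog("CatalogB", database_id="database-2")
project.add_catalog(first_catalog)
project.add_catalog(second_catalog)

customer = Table("Customer", Schema("SchemaA", first_catalog))
invoice = Table("Invoice", Schema("SchemaB", second_catalog))

customer_store = project.find_store(customer)
invoice_store = project.find_store(invoice)
```

Whichever route is taken, this walk up the parent chain has to happen somewhere before anything can be removed from the right store.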
Still convinced that what you picked as the right thing to do in the first paragraph is the right thing, or is it more complicated than you initially thought? I know this scenario is very specific, but that's precisely the point: the question presented is very simple and you've likely made this same decision many times before, as I have too. Yet the specific scenario, the specific context the decision has to be made in, makes things less trivial than they initially looked.
I don't believe in "do this and it will be all right" kind of advice without a firm context description, so I'm not going to give you any. Instead, I'm giving you advice on how you yourself might be able to find the right decision in similar and other situations you will run into: make a pro/con list for each alternative, optionally prioritize the pro/con items, and then simply look at the lists and make the decision which rationally makes the most sense. That's it. Make a decision based on rational reasoning and cold hard facts, document it, implement it and move on. Don't let your decision be led by what looks like a turn-key 10-step process to get it right which is supposedly applicable to any scenario/context (and thus also yours). There's no such thing: every situation, every scenario is different, and yours is too. Perhaps a guide tells you to componentize everything, which will take you 2 weeks to complete, and in the end 99% of the components are used by only one other component. Did you gain anything by that? It might be that you actually made things more complex than they were before you componentized everything. Who has to deal with that extra complexity? The people who gave you advice in a generic "Use ABC with XYZ and everything shall be great" article, or you?
In the end, what matters is that you a) made a decision and b) documented the reason why you made that decision and didn't take one of the alternatives. If, for example, after a year or two it turns out your decision wasn't the best based on the knowledge you have at that moment, you can always check the design decision documentation you made and conclude that you did make the right decision back then, but based on other, perhaps incomplete (compared to the situation after two years) knowledge/information. That's life. As my wise mother always says: "If you'd know everything up front, it's not hard anymore to get rich".
What did I decide? For this particular case I chose the interface route with the if/else for Catalog in the Project method. Yes, it's perhaps not that pretty due to the if-statement, but the alternative isn't that great either and, based on the pro/con items of both alternatives, the interface route seems the best choice. In this context, at this moment, with the information at hand.
Should you do the same thing in a situation where you have to make the same decision? That's not a conclusion you should draw from this post. Instead, you should take my advice, make the pro/con list and decide for yourself what to do. It might be that you make the same decision; it might also be that you pick an alternative. That's OK: you have the pro/con list plus the reasoning to show you made the right decision at that moment. That's what matters.