Archive for the ‘Software patterns’ Category

What Are APIs Anyway?

Friday, February 19th, 2010

Everyone’s heard of APIs these days.  Facebook has them, Twitter has them, Hotmail has them, Microsoft Office has them, Windows has them, Mac OS has them, pretty much everything has them.  I’m going to try and explain in simple terms, but also, almost by contradiction, in detail, what an API really is.  I’ll try and explain why SOAP isn’t how you clean yourself, and how being restful isn’t the same as being lazy.

APIs aren’t new, in fact, APIs are really really old.

Let’s start with a really simple definition.  API is an acronym and it stands for “Application Programming Interface”

To steal the current definition from Wikipedia: “An application programming interface (API) is an interface implemented by a software program to enable interaction with other software, much in the same way that a user interface facilitates interaction between humans and computers. APIs are implemented by applications, libraries and operating systems to determine the vocabulary and calling conventions the programmer should employ to use their services. It may include specifications for routines, data structures, object classes and protocols used to communicate between the consumer and implementer of the API.”

That’s a mouthful.  So to translate that into English, an API is a pre-defined set of “stuff” that allows one program to “talk” to another program.  This “stuff” is normally a set of “functions” that another program can call which makes the program being called do something.  Sometimes that other program is really just another part of the same program, and sometimes it’s a different system on a different machine in a different country.  That something is normally described in some kind of documentation, written somewhere, by someone.

If this sounds really vague it’s because in the real world, it really is.  Some of the first “APIs” involved dropping text files into directories on a computer that another program would watch and read from, and subsequently do “something”.  Arguably the most widely used API is the one that we use to write software for Windows (the Win32 API) and by contrast, that’s a C library of code that you can only call if you’re a programmer.  In the real world, we’ve tried to solve some of this ambiguity by standardizing the way APIs work around a few common bits of technology; both for our own sanity and to hopefully help software all just kind of work together.

A lot of APIs are code libraries called by other code libraries to do a specific task.  Microsoft released DirectX to deal with 3d graphics, OpenGL offered an open alternative.  Microsoft released the Windows API for Windows development; Apple released Carbon and Cocoa to program Mac OS.  That said when you hear about or discuss APIs casually, what you’re probably thinking about are actually “Web APIs”.

Web APIs

A Web API is an API like anything else, except it’s designed to work over the web.  As a result of this, since about 1994, there have been a number of efforts by a number of people (yes, that vague again!) to standardize the way systems communicate over the internet.

At first, everyone kind of invented their own way of communicating over the internet, people would connect to a server on some port (a port is just a pre-defined “way in”) and send some random pre-defined data in there, and that would make the computer at the other end do something.  Every API was different, and every time that you needed to talk to a different system, you had a really steep learning curve to work out how to talk to every application you wanted to integrate with.  It was a bit rubbish really.

So in 1998 a bunch of really smart people (Dave Winer, Don Box, Bob Atkinson, and Mohsen Al-Ghosein) got together and came up with SOAP (“Simple Object Access Protocol”).  SOAP is a big bunch of XML that was designed (and ratified) as a standard language that every different system could understand. SOAP is often also referred to as “web services”.

Basically it was designed to reduce the learning curve of getting to grips with the details of each system.  So now, if Timmy wanted to talk to Peter’s shiny new web application, he could point his developer tools at a standard location (something called a WSDL (Web Services Description Language) document) and he could just ask Peter’s web application what it did, and how he could use it.

SOAP was actually pretty good and solved a lot of problems and is still widely used today.  It really made waves by helping Microsoft software work pretty well with Java software and everyone was happy.

Almost.

See, because SOAP was solving a lot of big problems in freaking huge enterprise systems, it had a lot to be concerned about; lots of security, lots of encryption, lots of authentication.   That meant that the SOAP “language” was pretty big.  As an example, here (stolen from Wikipedia again!) is the SOAP message that would be sent across the internet to get the current stock price of IBM, from some cool stock price giving web service.

POST /InStock HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.example.org/stock">
    <m:GetStockPrice>
      <m:StockName>IBM</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>

People soon started thinking “Look at all that crap! What’s it all for! Why do I need it?! There’s lots of overhead here!  It slows me down!” and everyone thought “oh actually, well said” and RESTful web services were born.

REST is yet another stupid acronym that actually doesn’t count as an acronym because it uses random letters, but which developers think is either cool, clever or funny.  What it claims to mean is “REpresentational State Transfer”.  It was largely a reaction to the complexity and overheads of SOAP, but having been designed by Roy Fielding (one of the authors of the “Hypertext Transfer Protocol”, the silly http:// bit you type in a browser), it was also conceived as an API model that was “closer to the nature of the web”.

What this means to us lay folk is that while SOAP can technically be used over other protocols (it’s not bound to HTTP), and as such has to bake security and authentication models into its protocol, REST is designed specifically for the web; it doesn’t work without the web, and it wouldn’t make any sense without the web.  Fielding basically decided that “the web does all of that stuff anyway”: we have security via HTTPS / SSL certificates, and we have semantics that describe getting and pushing data in the HTTP methods (the HTTP method, or verb, tells the server you’re connecting to what you’re trying to do; for example, you use GET for getting stuff, POST and PUT for doing stuff), so let’s just use that and be done with it.

As a result of this way of thinking, REST is a simplified and less general Web API pattern, not concerned with infrastructure like SOAP is.  I’ll convert the above stock price example into a REST example below.

GET /Stocks/Price/IBM HTTP/1.1
Host: www.example.org
Accept: application/xml

Done.

Basically, REST takes out all the fluff, and decides that if you want to get a stock price for IBM, just make a GET request to the URL http://www.example.org/Stocks/Price/IBM and let the web server at the other end work out what to do from the URL.

The above example would probably return an XML document that looks like this.

<stocks><stock name="IBM" price="3.45"/></stocks>
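
To make the “let the web server work out what to do from the URL” idea a bit more concrete, here’s a minimal sketch in JavaScript.  Everything in it is made up for illustration: the handleRequest function, the hard-coded prices table and the route shape are assumptions, not any particular framework’s API.

```javascript
// Hypothetical in-memory price table standing in for a real data source.
const prices = { IBM: "3.45" };

// Maps an HTTP method and path onto a response, using nothing but the
// method and the URL to decide what the caller wants.
function handleRequest(method, path) {
  const match = /^\/Stocks\/Price\/([A-Za-z.]+)$/.exec(path);
  if (match === null) {
    return { status: 404, body: "" };
  }
  if (method !== "GET") {
    // The resource exists, but this sketch only supports reading it.
    return { status: 405, body: "" };
  }
  const name = match[1].toUpperCase();
  if (!Object.prototype.hasOwnProperty.call(prices, name)) {
    return { status: 404, body: "" };
  }
  return {
    status: 200,
    body: `<stocks><stock name="${name}" price="${prices[name]}"/></stocks>`,
  };
}

console.log(handleRequest("GET", "/Stocks/Price/IBM").body);
```

The caller never says “call GetStockPrice”; the verb and the URL carry all of the meaning, which is the whole point.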

People have started to gravitate towards REST due to its simplicity.  There’s lots of other stuff a RESTful web service should do (it should provide you with all the information that a computer would need to go through a process, just like a user interface gives the user all the information they need to click through a process), but fundamentally it’s simple and leverages the existing semantics of the web to its advantage.

Ok, Give Me One Of Those!

So now we know what an API is there to do (let two systems talk to each other) and how they do it (as a general rule, either via SOAP or RESTful services) and we know that everyone else has one, let’s get one of our own.

There are plenty of tools available in pretty much any programming language to make creating APIs pretty easy.  There are plenty of design concerns to take into account when building an API but in my opinion, the way to approach the problem is to work out what your users really want to do with your application, website or platform, and let them do it.

A good API lets another programmer talk to your system using terms that he understands, so take the time building up a glossary of your business terms vs. what the public understands those concepts to be.  Don’t build API methods that look like:

/PaymentResolutionProcess/PaymentResolveTable3/Resolve/EntityId/123

because it means nothing to anyone, instead, build a bunch of methods that make sense for people to use, the above example could look like this instead:

/Payment/MakePayment/123

Watch your language, and build sensible interactions with your system.  Don’t make APIs for stuff that isn’t going to be used, and where possible, just have the APIs call the same code that your website does.
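
As a rough sketch of “have the APIs call the same code that your website does”, the snippet below routes both a website form handler and an API endpoint through one business function.  All of the names (makePayment, websiteCheckoutHandler, apiHandler, the ledger object) are hypothetical and only there to show the shape.

```javascript
// Hypothetical core business function: the one place a payment is made.
function makePayment(paymentId, ledger) {
  if (ledger.paid.includes(paymentId)) {
    return { paymentId, status: "already-paid" };
  }
  ledger.paid.push(paymentId);
  return { paymentId, status: "paid" };
}

// The website's form handler is a thin wrapper around makePayment...
function websiteCheckoutHandler(form, ledger) {
  return makePayment(Number(form.paymentId), ledger);
}

// ...and so is the API endpoint, so the two can't drift apart.
function apiHandler(path, ledger) {
  const match = /^\/Payment\/MakePayment\/(\d+)$/.exec(path);
  if (match === null) {
    return { status: "not-found" };
  }
  return makePayment(Number(match[1]), ledger);
}

const ledger = { paid: [] };
console.log(apiHandler("/Payment/MakePayment/123", ledger));
console.log(websiteCheckoutHandler({ paymentId: "123" }, ledger));
```

Because both entry points share makePayment, the second call sees the payment the first one made, whichever door it came in through.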

Get this right, and you’re on a gradual but successful road to calling your “website” a “platform”.

Writing Presentable Code Pt.1 – Properties and Variables

Wednesday, November 4th, 2009

At work we’re currently discussing coding standards, specifically to synchronise development in two countries and keep the style consistent across the teams.  You know, the usual stuff. 

When people start discussing coding standards, it quickly devolves into a religious debate and honestly, I think a lot of it comes down to personal preference.  Because of that, I’m going to spend a post or two telling you why you’re wrong and why I wouldn’t take your code to dinner however much it offered to put out.  Because clearly my way is the only right way!

Joking aside, I want to go into some detail on how I present and write my code, and hopefully explain why.  It’s all going to be slanted towards (who do I think I’m kidding, it’s going to be in) C# so as ever, your mileage may vary with any advice you extrapolate.  I’m going to start out by showing you some bad examples, attempt to explain why I think they’re bad, and offer my alternative.

Properties and Variables

The way you declare your properties and variables is seemingly insignificant, but if you get it wrong it trashes the readability of your code.  Take this code sample for example:

image

All I’ve done in the above screenshot is declare a few properties, a few instance variables and a constructor.  And it looks awful and un-maintainable despite the lack of any significant code smell, all due to the manner in which I’ve declared the variables.  It’s a laundry list of mistakes.

  1. Using field backed properties when an auto-property will suffice.
  2. Defining auto-properties split across multiple lines for no explicable reason.
  3. Adding utterly redundant code comments (the code-criticism comments aside).
  4. Terrible and ambiguous variable naming.
  5. Variable names that contain hints at data types.

The above code sample is practically unreadable, even without the comments, it’s long winded and obtuse:

image

Now, I come from the school of thinking that is pretty much convinced that typing things is bad, repeating yourself is bad, hell, writing code is bad.  So don’t.  Less really is more, pick your favourite buzz phrase.  Cleaning up your code should involve making it as simple and as clear as is humanly possible.

Thankfully, if you take advantage of the language features of C#3, you can quickly make something like that look like this:

image

Just by tidying up the way you declare and use your variables, you can make your code eminently more readable.  If you compare the two examples, you’ll see that all I’ve done is

Use single line declarations for auto-properties.

  • Why waste 3-5 lines on an auto-property that can easily fit on one without any loss in readability?

Removed field backed properties in exchange for auto-properties with access modifiers on the setter.

  • Functionally equivalent and far neater

Renamed badly named properties (in the first example “FLineOfAddress”) to be more meaningful.

  • Remove abbreviations where possible, they damage readability
  • Assume the maintainer of your code has no business knowledge, make things easy
  • Meaning is always better in variable names than in comments / meta-data
  • Don’t fear long variable names, modern IDEs have auto-complete, you don’t have to type that stuff by hand.  Embrace your tooling!
  • If you can’t tell what’s in your property or variable from its name, you’ve failed, go back and try again.
    • This honestly includes stuff like foreach(var item in MyCollection) and StringBuilder sb = new StringBuilder();  Both bad and wrong, don’t do it.

Only retained comments where the comment data is truly meaningful. 

  • The above example isn’t particularly good (everyone knows what a URL is), but only keep comments in your code where they add something that you couldn’t attain with careful renaming and code restructuring / refactoring.  The meaning of your code should be obvious to the reader without metadata.

Stick to a solid naming convention for public / private / protected variables and properties.

  • The well trodden convention I’m following above is lowerCamelCase plus…
    • A leading underscore for private instance variables (determining scope)
    • Regular lowerCamelCase for local variables
    • UpperCamelCase for property names, constants and statics.
    • No data types in your variable names.  This is not 1980. The IDE gives you all that lovely meta-data, don’t give yourself RSI duplicating it in your variable names.

Cleaning up usage

  • Removing this., you get the same scoping from using _ by convention in your variable names, save those fingers from RSI…
  • Using instance and local variables instantly becomes clearer by sticking to convention.

Using var to reduce duplication in code.

This is often controversial but I feel that using var, for the most part, reduces the amount of typing required without any loss of clarity.  Take the following examples:

image
It’s clear to me that no clarity is lost by not typing “StringBuilder” twice.  It’s still right there in front of you and allows you to keep your variable declarations more uniform.  Despite popular misconception this doesn’t affect the type safety of C#: the language and your variable are still strongly typed; the compiler just infers at compile time that when you said var you meant StringBuilder.  If it isn’t really obvious what an object is when you instantiate it, you’re probably doing something really wrong elsewhere.

People occasionally like to argue that while for declarations var is all well and good, when you’re using it for return values it causes a loss of clarity.  It’s an interesting point but always feels slightly off the mark to me.  Whenever people attempt to give me an example of this lack of clarity, it’s always that their variables or properties are ambiguously or inappropriately named, and the code clarity can be regained and even improved by naming the variables involved in a more descriptive way.  Take the following snippet for example:

image
In the first case, I’d agree that using a var called “l” to store the return value of that method would lead to a loss in clarity.  But if you had string l = RetrieveTextLabel(); and then, say, 20 lines down attempted to use a variable called “l” you’d probably deserve a swift kicking for naming something so poorly.  By contrast, var textLabel is exceptionally descriptive.  People also occasionally say that using var in foreach loops causes this ambiguity, but again, if you name your collection appropriately and your yielded value correctly, it really is never an issue.

Even more importantly, if you get your naming right, var actually helps you quickly refactor your code.  As long as you understand the “meaning” of your variables, the IDE can fill in the blanks with regard to data types, because for the most part, it really doesn’t matter what type of data is actually in that variable when you’re reading the code as long as its meaning is a known quantity.  I actually feel that the dynamic language crowd learnt this lesson long ago, and people that work predominantly in statically typed languages actually tend to rely on the type system like a crutch to excuse terrible naming conventions.  Time to learn from PHP…

    In conclusion…

    To make your code readable you should stick to conventions for naming, always strive to add meaning in variable names and be as brief as possible.  Don’t litter your code with crap and you’ll be thankful for it later.

    Obviously, this is all my opinion, but I swear by it.

    I’ll be following up this post in the next few days with some continued patterns for readable code.

    Reusable Editable Fields for ASP.net MVC Using jQuery

    Thursday, October 8th, 2009

    A friend recently asked me about editing items inline using ASP.net MVC, the kind of thing that was auto magically wired up with post backs in “old fashioned” asp.net so I’ve whipped up a small example showing how you can use jQuery to declaratively set up interactive field editing with a sprinkling of Ajax and JSON.

    I’m basing this example on the default ASP.net MVC starter project for brevity (download attached) but here’s an overview:

    First you need to set up an Action method (or multiple action methods) on your controller to accept the modification of data.  In my example I’ve added an unimaginative method called “SetField” to the HomeController that looks like this:

    public ActionResult SetField(string fieldName, string fieldValue)
    {
        var response = Json(fieldValue);
        return response;
    }

    As you can see, it doesn’t do very much (useful implementation left to the reader) but it accepts the parameters of a field name, and a field value.  You’ll need to roll your own validation and sanity checking here.  It then returns the fieldValue using the MVC Json helper object, as a Json object.  In a real world example, you’d want to call an update in this method.

    Now, in the view, jQuery does most of the hard work.

    First I added a few CSS classes to the header on the master page (for the sake of example):

    <style type="text/css">
        .editableItem { display: block; }
        .fieldViewer { display: block; }
        .fieldEditor { display: none; }
        .editableItemCancel { display: block; }
        .editableItemBox { display: block; }
    </style>

    I then added an example to the view that looked like this:

    <div class="editableItem" id="editable_FieldName">
        <div class="fieldViewer">Click me to edit me!</div>
        <div class="fieldEditor"><input class="editableItemBox" type="text"/><span class="editableItemCancel">cancel</span></div>
    </div>

    With this HTML I set up some conventions that I’ll rely on when using jQuery.  Firstly, every editable item should use the class “editableItem” and have an id of the form “editable_FieldName”, where FieldName is the name of the field being edited.  I use the class in a jQuery selector and the id to establish which field is being edited.  Inside the editableItem should be a fieldViewer, containing the current data, and a fieldEditor, which is hidden by default, and contains some kind of editable control and a cancel button.  You could insert these elements at runtime if you wished, but in order to keep the example simple I’ve declared them in the HTML.

    Next I added some jQuery… The jQuery defines some Javascript behaviour associated with the classes used in the HTML, this way, the mark-up can be reused to edit multiple fields rather than being keyed to the Id of a specific field.

    <script src="/Scripts/jquery-1.3.2.js" type="text/javascript"></script>
    <script type="text/javascript">
        jQuery(document).ready(function() {

            // Clicking the viewer copies its text into the textbox and swaps to edit mode.
            $(".editableItem .fieldViewer").click(function() {
                var parentId = $(this).parent().attr("id");
                $('#' + parentId + " .fieldEditor .editableItemBox").val($('#' + parentId + " .fieldViewer").text());
                $('#' + parentId + " .fieldViewer").toggle();
                $('#' + parentId + " .fieldEditor").toggle();
            });

            // Cancel swaps back to the viewer without posting anything.
            $(".editableItem .fieldEditor .editableItemCancel").click(function() {
                var parentId = $(this).parent().parent().attr("id");
                $('#' + parentId + " .fieldViewer").toggle();
                $('#' + parentId + " .fieldEditor").toggle();
            });

            // Enter (key code 13) posts the new value to the SetField action.
            $('.editableItem .editableItemBox').keypress(function(e) {
                if (e.which == 13) {
                    var parentId = $(this).parent().parent().attr("id");
                    var fieldName = parentId.replace(/^editable_/, "");

                    $.post('/Home/SetField',
                    {
                        fieldName: fieldName, fieldValue: $('#' + parentId + " .editableItemBox").val()
                    },
                        function(data) {
                            // The "json" dataType below makes jQuery parse the response, so no eval() is needed.
                            $('#' + parentId + " .fieldViewer").text(data);
                            $('#' + parentId + " .fieldViewer").toggle();
                            $('#' + parentId + " .fieldEditor").toggle();
                        }, "json");
                }
            });

        });
    </script>

    Quite simply, if you click the editable field, it toggles into a textbox.  If you hit enter on the textbox, the value is posted to the previously defined Action on the Controller.  If you hit cancel, the display is toggled back.
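
The two conventions the script leans on (the “editable_” id prefix, and a JSON response from the action) can be pulled out into small, testable functions.  This is only a sketch: fieldNameFromId and parseFieldValue are hypothetical helper names of mine, not part of jQuery or ASP.net MVC.

```javascript
// Derives the field name from an element id that follows the
// "editable_<FieldName>" convention; returns null if it doesn't.
function fieldNameFromId(elementId) {
  const prefix = "editable_";
  if (typeof elementId !== "string" || !elementId.startsWith(prefix)) {
    return null;
  }
  const fieldName = elementId.slice(prefix.length);
  return fieldName.length > 0 ? fieldName : null;
}

// Parses the JSON text returned by the action without resorting to eval().
function parseFieldValue(responseText) {
  return JSON.parse(responseText);
}

console.log(fieldNameFromId("editable_FirstName"));
console.log(parseFieldValue('"Click me to edit me!"'));
```

Keeping the convention in one place means a typo in an id shows up as a null field name rather than a confusing post to the server.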

    I’d not recommend copy and pasting this exact example into a production system, but hopefully it’ll guide you through a simple scenario.  You can use a similar technique to add all sorts of little Ajax tricks (auto-suggest, lookups, dynamic menus) to your ASP.net MVC site using jQuery and Json (both of which are included in the core asp.net MVC framework).

    Download the example solution here

    Designing Client Facing APIs – Best Practices

    Wednesday, September 30th, 2009

    With the popularity of service oriented architectures and other buzz phrases related to software as service, good API design has become a significant selling point for any software platform in the past 5-10 years.  People make purchasing decisions based on how easy it is to interoperate with your applications and code and as such the number of client / public facing APIs attached to software has skyrocketed.  I’d like to believe the days of dropping strategic text files in directories to trigger some action or another in an application are behind us.

    In this article I’m going to talk about the following things

    • Why you should choose your method names carefully, and what to call them
    • Why pretending to be a data access layer is a terrible thing for an API to do
    • Talk about the dangers of leaky abstractions in an API
    • Explain the benefits of creating a data contract between you and the calling code
    • Explain why it’s vital to support standards
    • Make sure that your users can retrieve values they’re going to want to modify
    • Suggest supplying compiled libraries alongside your API documentation
    • Explain why it’s important to keep your API implementation clean
    • Talk about the benefits of dogfooding your API
    • Consider supporting atomic operations including rollbacks on failure
    • Discuss bulk operations
    • Try and convince you that both logging and security should be first class citizens
    • Beg you to maintain integration tests and most of all to keep it simple!

    Your API Sucks

    You’ve probably used an API and there’s a good chance you’ve had to write one.  This probably won’t surprise you; most APIs suck.  They’re horrible to use and built around illogical leaky abstractions that leave you flicking through huge wads of documentation just to make the most rudimentary feature work.

    About 18 months ago, after a year of struggling with a broken third party API that almost brought a business to its knees by placing significant roadblocks in front of in house development, I was part of a team tasked with designing our own client facing API.  With no desire to expose other developers to the cruel and unusual punishments of software design we’d had to endure, we came to the conclusion that it was really important that we got this piece of the system right.  People say first impressions are everything, and your API design can make or break the faith other developers have in your ability to produce software.  Show somebody a shitty API and they’ll perhaps correctly assume the rest of your code sucks too.

    The Best Man For The Job

    There’s a bit of a trend that I’ve noticed with some of the worst APIs I’ve worked with: they seem to be designed by the wrong people.  The wrong people to design an API are 1) the guy that wrote the internal code to do the job the API is providing access to and 2) the consumer of the API. 

    The guy that wrote the code that the API is calling under the hood will be inherently slanted to implement an API which exposes this functionality and will have a predisposition to creating a leaky abstraction.  This is especially bad for the consumer: APIs designed by the internal implementer tend to assume the consumer knows far more than he really does, or has access to internal data that in reality, he doesn’t.

    Conversely, an API designed by the consumer of the API will have a tendency towards solving problems that are not the concern of the API itself.  The consumer will, either accidentally or by intention, attempt to offload some of the work that should be the responsibility of the calling code into the API.

    Ideally, the person that’s writing the API will have knowledge of the system internals, but not be the guy that wrote them. A fellow team member with passive experience of the code would be a good person, or ideally, a pair design exercise between the person that originally wrote the code and an API designer, with the consumer as a consultant.

    Speaking The Same Language

    Like a lot of software development, you make good progress when you get your terminology right and understand exactly what you’re trying to produce.  I’ve consistently found that the best way to think of a client facing API is as a thin orchestrating wrapper that summarises, in code, a set of business processes that you wish to expose to the public.

    In order to get your API design right, you need to clearly define and agree on the boundaries of the system with both your internal team, and your consumers.  It’s important that you have a clear understanding of the following:

    • The responsibility of the calling code
    • The responsibility of the API layer
    • The responsibility of the internal code the API makes calls to

    This might sound like a really simple suggestion but I’ve taken part in countless discussions where people on both sides of the API just “presumed” that either the calling code or the API would perform specific functions (data cleansing, logging etc) when in fact, this confusion had led to none of the implementers bothering to write the required functionality.  Make sure you know for certain what your API is responsible for doing.

    Defining Your API – Tips and Tricks

    Defining your API methods (or the “contract” of the API) is the most important thing to get right and there are several vital things to remember.

    • Choose Your Names Wisely Using the language of the business

      It’s vital that your API methods speak in terms that the caller is going to understand.  Your API should be readable.  If your users go hunting for the documentation every time they want to use a method, then you’re probably doing it wrong.

      Clarity in naming is exceptionally important.  The names of your API methods should succinctly state what action that method call is going to perform.  Don’t fear using long method names, embrace them for clarity.  As a general rule, your methods should probably always be in the form DoSomething(object withThis);

      Ensure that when naming methods you reflect business operations in the method names, not the underlying implementation.

      Bad example:     void InsertToTblCustomer(string[] custDataValues);
      Good example:   void AddCustomer(Customer customer);
      Good example:   void DisableAccount(string accountId); 
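
A rough JavaScript sketch of the same idea: the public methods are named after business operations, while the ugly storage names stay private.  The customerApi object, the tblCustomer array and its column names are all hypothetical.

```javascript
// Hypothetical internal storage layer; callers never see these names.
const tblCustomer = [];

// Finds the internal row for an account, or undefined if there isn't one.
function findRow(accountId) {
  return tblCustomer.find((row) => row.acct_id === accountId);
}

// Public API surface: each method names a business operation.
const customerApi = {
  addCustomer(customer) {
    tblCustomer.push({ cust_nm: customer.name, acct_id: customer.accountId, actv_flg: 1 });
  },
  disableAccount(accountId) {
    const row = findRow(accountId);
    if (row === undefined) {
      throw new Error(`Unknown account: ${accountId}`);
    }
    row.actv_flg = 0;
  },
  isAccountActive(accountId) {
    const row = findRow(accountId);
    return row !== undefined && row.actv_flg === 1;
  },
};

customerApi.addCustomer({ name: "Timmy", accountId: "ACC-123" });
customerApi.disableAccount("ACC-123");
console.log(customerApi.isAccountActive("ACC-123"));
```

Nothing a caller types mentions a table, a flag or a column, so you’re free to change all of those later.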

    • Don’t pretend to be a data access layer

      APIs should summarise business operations in a logical and meaningful fashion.  You are not a public facing data access layer and you should never pretend to be.  If your users want raw database access give them read only permissions on some tables and a copy of SQL Management Studio.  So don’t write methods for CRUD operations in your API (unless you’re writing some kind of online file management utility).

      Bad example:     void InsertToTblCustomer(string[] custDataValues);
      Bad example:     void UpdateTblCustomer(string[] custDataValues);
      There are no good examples!

    • Avoid leaky abstractions

      This is a fundamental and simple rule – don’t expose your callers to anything that they’re not interested in or won’t understand.  If it’s not important, don’t show it.  Don’t code for things nobody will ever need and don’t require your callers to have intimate knowledge of data types or internal categories in your system.

    • Create a data contract between you and the calling code

      I’m going to borrow some of the terminology from WCF here because I’ve found it an appropriate label.  Create a Data Contract library for use in your API.  This library should summarise the business process and the outward facing view of your software.  It might contain terminology that doesn’t actually exist in the software itself, but in the business processes that the software models.  Either way, this, and only this, should be the language that the API talks to your callers.

      Where possible, create this data contract in a separate assembly that’s entirely decoupled from your core system and distribute it to people that want to use your API.  This is especially beneficial when using WCF as your clients can generate a service proxy and deal in the same data types that you are in your API code.

      It should be the responsibility of the API layer to marshal the data from your data contract into the domain model of your internal components.

      Your data contract should contain every type used to communicate with your API and the object model should be named in a way which is meaningful for the consumers.

      Because your data contract is NOT the object model of your internal components, you’re able to add properties and objects that don’t logically exist in your internal components.  This means that you can perform an operation using some internal component, gather the output in your API layer and then compose the output data in a meaningful way using classes written specifically for the data contract.  This way, by the time the user has access to the output data, it’s in a format and language which they understand.
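
Here’s a minimal sketch of that marshalling step, in JavaScript rather than WCF.  The internal record shape, the status codes and the toOrderSummary function are all invented for the example; the point is only that the contract type is composed in the API layer and speaks the caller’s language.

```javascript
// Hypothetical internal record, shaped by the database rather than the business.
const internalOrder = {
  ord_id: 42,
  cust_ref: "C-1001",
  line_totals_pence: [1250, 399],
  stat_cd: 3,
};

// Hypothetical lookup from internal status codes to the words callers use.
const statusNames = { 1: "Pending", 2: "Paid", 3: "Shipped" };

// The API layer marshals the internal record into the data contract type.
function toOrderSummary(order) {
  const totalPence = order.line_totals_pence.reduce((sum, pence) => sum + pence, 0);
  return {
    orderNumber: order.ord_id,
    customerReference: order.cust_ref,
    status: statusNames[order.stat_cd] || "Unknown",
    // A property that doesn't exist internally, composed for the caller.
    totalAmount: (totalPence / 100).toFixed(2),
  };
}

console.log(toOrderSummary(internalOrder));
```

Note that totalAmount has no internal equivalent at all; it only exists because it’s what the consumer actually wants to see.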

    • Support standards!  Don’t reinvent the wheel!

      Here’s a true story; while working with an API, my team was faced with the following API method:
      object Run(string request);

      It was the only method on the API, and “covered up” for around 30 methods all made available through one giant black hole in the side of the system.  Underneath that there was an XML format that the request had to be in in order to call the appropriate method.

      If you’re writing an API, stick to some kind of standards.  Ideally, expose a web service endpoint with an accurate WSDL that people can call or use a simple and obvious REST endpoint.  

      Please, please, please, do not make other developers suffer by rolling your own delivery mechanism.  We have enough of them; don’t confuse people by adding more.  If you’re going to use a raw sockets connection, supply a calling library and stick to some standard middleware like WCF rather than rolling your own.

      Thousands of people have spent thousands of man years writing code based on existing techniques; you’re not better than all of them combined.

      Reinventing the wheel is never good for anyone.

    • Ensure that the user can retrieve values they’re going to want to modify

      The number of times I’ve used APIs that let me set or update objects without returning the current state of the object is mind blowing.  So always remember: If you’re going to let them set it, let them get it.  Providing an update or “upsert” method without allowing your consumers to query the current state of data in the system is a complete waste of time.
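
      A minimal sketch of that symmetry, assuming a hypothetical `ICustomerApi` interface and `CustomerContract` type (the in-memory implementation exists only to make the example runnable):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data contract type used only for this illustration.
public class CustomerContract
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

// Every value a caller can update can also be read back.
public interface ICustomerApi
{
    CustomerContract GetCustomer(int id);
    void UpdateCustomer(CustomerContract customer);
}

// In-memory stand-in for a real implementation.
public class InMemoryCustomerApi : ICustomerApi
{
    private readonly Dictionary<int, CustomerContract> _store =
        new Dictionary<int, CustomerContract>();

    public CustomerContract GetCustomer(int id)
    {
        CustomerContract found;
        if (!_store.TryGetValue(id, out found))
            throw new KeyNotFoundException("No customer with id " + id);

        // Return a copy so stored state only changes through UpdateCustomer.
        return new CustomerContract { Id = found.Id, Name = found.Name, Email = found.Email };
    }

    public void UpdateCustomer(CustomerContract customer)
    {
        if (customer == null) throw new ArgumentNullException(nameof(customer));

        // "Upsert": insert the customer, or replace the existing one.
        _store[customer.Id] = new CustomerContract
        {
            Id = customer.Id,
            Name = customer.Name,
            Email = customer.Email
        };
    }
}
```

      With the getter in place, a consumer can read the current state, change one field and send the whole object back without guessing at the values they didn’t mean to touch.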

    • Supply compiled libraries to work with your API documentation

      This may not always be possible, but it’s a really good idea to supply a sample implementation and compiled binaries with your API that cover the most common usage scenarios.  Not only does this prevent the consumer from struggling to get to grips with your API, but it allows you to outline and illustrate a set of best practices for usage.  In an ideal world, the user could just use your sample code in their application, so ensure you license the code appropriately.

      This is an exceptionally good way to deal with any authentication your API may require as you have the ability to provide additional helper classes to perform some common tasks (authenticate –> perform action –> logout, for example).

    • Keep your implementation clean

      Delegate the API logic to your middleware components / reusable libraries.  Do your best to ensure the API layer doesn’t actually contain the logic required to perform operations, just the logic required to marshal the data from the API format into your internal data structures.  The API should simply orchestrate calls to one or more internal methods because your API should simply be exposing existing functionality.

      If the API is exposing some new, API specific functionality, consider splitting this behaviour into a separate assembly or binary to aid testability.

    • Consider dogfooding

      Dogfooding is the act of using the software you’re creating.  I worked on a project where we were developing an order placement system in ASP.NET MVC, and as part of the design process we decided that we wanted to have an API that was a first class citizen.  It then dawned on us that in order to produce the API in a way that accurately mirrored the functionality of the website, we should have the website consume the API like any other client would.  The website had its own concept of user authentication, and when a user logged in, the web application logged in to the API as the current user.

      Doing this not only ensured that our security model was watertight, but that any additional web functionality would immediately be available to API users because they were actually the same thing.  On top of that, you gain confidence in your own API because you know that it’s called often by your own code, reducing the likelihood of users discovering bugs in your API because it’s not a product you actually use.

    • Support atomic operations including rollbacks on failure

      When implementing your API methods, ensure that if an exception occurs or an operation doesn’t complete, all the changes made by your API call are reversed.  Consider explicitly supporting transaction scopes in your API to let your consumers compose their own “set” of operations.
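
      Where the underlying resources are transactional, .NET’s System.Transactions.TransactionScope is the usual tool for this.  Where they aren’t, one possible approach is to pair every change with a compensating action.  The sketch below assumes a hypothetical `AtomicOperation` helper written purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: runs steps in order and, if one fails,
// undoes the steps that already completed in reverse order.
public class AtomicOperation
{
    private readonly List<Tuple<Action, Action>> _steps =
        new List<Tuple<Action, Action>>();

    public AtomicOperation AddStep(Action apply, Action undo)
    {
        if (apply == null) throw new ArgumentNullException(nameof(apply));
        if (undo == null) throw new ArgumentNullException(nameof(undo));
        _steps.Add(Tuple.Create(apply, undo));
        return this;
    }

    public void Execute()
    {
        var completed = new Stack<Action>();
        try
        {
            foreach (var step in _steps)
            {
                step.Item1();               // apply the change
                completed.Push(step.Item2); // remember how to undo it
            }
        }
        catch
        {
            // Roll back completed steps, newest first, then let the
            // caller see the original failure.
            while (completed.Count > 0)
            {
                completed.Pop()();
            }
            throw;
        }
    }
}
```

      The step that failed is not undone, only the ones before it, so each step still needs to leave nothing behind when it throws.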

    • Support bulk operations where appropriate

      Building support for bulk operations into your API can often prevent performance issues occurring later when a user tries to, for example, insert 10,000 customers sequentially.  Consider pluralising your methods, so instead of providing an AddCustomer(Customer customer) method, provide only an AddCustomers(List<Customer> customers) method.  Doing this prevents callers from overloading your system by pushing bulk data through your API in unintended ways, and allows you to properly cache required data and cater for these bulk operations.

      This isn’t always appropriate, however I’d always strongly suggest offering pluralised versions of methods that you suspect may be used in bulk, in order to help optimise your API calls and reduce the amount of data being transferred over the wire.
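
      Here’s a small sketch of a pluralised method.  `CustomerApi` and its in-memory list are assumptions made for the example; the point is that the whole batch is validated and written in one call:

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; }
}

// Hypothetical API class: only the plural method is exposed.
public class CustomerApi
{
    private readonly List<Customer> _customers = new List<Customer>();

    public int BatchesProcessed { get; private set; }

    public int CustomerCount
    {
        get { return _customers.Count; }
    }

    public int AddCustomers(List<Customer> customers)
    {
        if (customers == null) throw new ArgumentNullException(nameof(customers));

        // Validate the whole batch before changing anything.
        foreach (var customer in customers)
        {
            if (customer == null || string.IsNullOrWhiteSpace(customer.Name))
                throw new ArgumentException("Every customer needs a name.", nameof(customers));
        }

        _customers.AddRange(customers); // one write for the whole batch
        BatchesProcessed++;
        return customers.Count;
    }
}
```

      A caller with a single customer just passes a list of one.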

    • Logging as a first class citizen

      Don’t wait until somebody asks about API usage to decide to log it.  Build logging into your API wrapper, from the start, at the point of every method call.  It doesn’t need to be fancy, and you can use a number of freely available components to handle these logs and log rotation (consider using log4net or log4j for simple log rotation).

      Log each method call and some summary or identifiable element of the data passed to it.  This’ll help you profile API usage, and identify how data changed in your previously closed system.
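
      As a sketch, assuming a hypothetical `IApiLog` abstraction (which a real project might back with log4net) and a hypothetical `LoggedOrderApi` wrapper, logging at the point of the call can be as plain as this:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical logging abstraction, invented for this example.
public interface IApiLog
{
    void Info(string message);
}

// In-memory log so the sketch runs without any logging library.
public class InMemoryApiLog : IApiLog
{
    public List<string> Entries { get; } = new List<string>();

    public void Info(string message)
    {
        Entries.Add(message);
    }
}

// Hypothetical API wrapper: logs each call, then delegates.
public class LoggedOrderApi
{
    private readonly IApiLog _log;
    private readonly Func<int, string> _getOrderStatus; // stands in for an internal component

    public LoggedOrderApi(IApiLog log, Func<int, string> getOrderStatus)
    {
        if (log == null) throw new ArgumentNullException(nameof(log));
        if (getOrderStatus == null) throw new ArgumentNullException(nameof(getOrderStatus));
        _log = log;
        _getOrderStatus = getOrderStatus;
    }

    public string GetOrderStatus(int orderId)
    {
        // Log the method name and an identifying element of the data.
        _log.Info("GetOrderStatus orderId=" + orderId);
        return _getOrderStatus(orderId);
    }
}
```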

    • Security as a first class citizen

      Consider the security of your API from the start of the project.  Understand who will have access to your API, which organisations and which individuals.  Do you require role-based security?  Do you need a way to disable API support for specific customers?  Are you transferring data that needs to be encrypted over the wire?

      Beware of over baking your security.  WS-* offers some very robust packet level security features, but if your API doesn’t need them, or is restricted to an internal network, then don’t bog down your implementation in unneeded security.  Beware of making security choices that tie you down to a specific protocol or technology stack – you want to keep your API usable for the consumers.  Do the simplest thing that works.

    • Have integration tests!

      Make sure you have integration tests with mocking at the business logic layer.  These tests are for your API wrappers, NOT your logic.  The logic should be tested independently; at this level you’re just ensuring that your API layer, marshalling and method calls operate correctly.

      With any luck, your business logic should already be tested as part of your existing test suite (which you have, right?) but if not, ensure the business logic is tested separately from the API code.

      Consider using a TDD or BDD approach to designing your API calls: design the specification first in the form of some calling code, then write the code required to make your usage examples compile.  This will help you understand exactly what calls the client will have to make to your API to achieve specific functionality.  These tests can happily double up as regression tests when you make changes to the API.
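
      A sketch of such a test, using a hand-rolled fake rather than any particular mocking or test framework.  `IPricingService`, `QuoteApi` and `FakePricingService` are hypothetical names invented for the example; the test checks only what the API layer passes down and how it shapes the result:

```csharp
using System;
using System.Globalization;

// Hypothetical business-logic boundary that the API layer calls into.
public interface IPricingService
{
    decimal PriceFor(string productCode, int quantity);
}

// Hypothetical API wrapper under test: it only marshals and delegates.
public class QuoteApi
{
    private readonly IPricingService _pricing;

    public QuoteApi(IPricingService pricing)
    {
        _pricing = pricing;
    }

    public string GetQuote(string productCode, int quantity)
    {
        string normalised = productCode.Trim().ToUpperInvariant();
        decimal price = _pricing.PriceFor(normalised, quantity);
        return normalised + ":" + price.ToString("0.00", CultureInfo.InvariantCulture);
    }
}

// Hand-rolled fake that records what the API layer passed down.
public class FakePricingService : IPricingService
{
    public string LastProductCode { get; private set; }
    public int LastQuantity { get; private set; }
    public decimal PriceToReturn { get; set; }

    public decimal PriceFor(string productCode, int quantity)
    {
        LastProductCode = productCode;
        LastQuantity = quantity;
        return PriceToReturn;
    }
}

public static class QuoteApiTests
{
    public static void GetQuote_NormalisesCodeAndFormatsPrice()
    {
        var fake = new FakePricingService { PriceToReturn = 12.5m };
        var api = new QuoteApi(fake);

        string result = api.GetQuote(" abc ", 3);

        if (fake.LastProductCode != "ABC") throw new Exception("product code was not normalised");
        if (fake.LastQuantity != 3) throw new Exception("quantity was not passed through");
        if (result != "ABC:12.50") throw new Exception("unexpected result: " + result);
    }
}
```

      No pricing logic runs here at all, which is the point: if this test fails, the fault is in the API layer.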

    • Keep it simple!  If all else fails, do what the big boys do.

      Always strive to keep your API simple.  Pretend you’re the consumer at all times.  If you’re unsure of how to proceed, I’ve always found inspiration, both for what to do and what not to do, from reading the API documentation of some large companies that have widely used APIs.  It’s safe to say that the likes of Amazon, Google and Microsoft have had to put some thought into their API designs.  Beware of trusting their decisions blindly, but liberally borrow anything you, as a consumer, would find pleasing in your API.

    I’m not going to try and convince you that by following my advice your API design will be flawless.  I’m really hoping for a little discussion on this topic as it seems like something that is rarely covered and often “felt out” by the people left to implement APIs for the public.  These are just some lessons I’ve learnt on the way while implementing several public facing APIs.

    Want to talk about APIs?  Send me an email!

    Xml Comment Hell – A Software anti-pattern

    Monday, July 6th, 2009

    One of the most valued practices in software development is brevity.  Writing code is bad.  When you write code you create bugs, and creating bugs is bad.  The solution?  Don’t write much code.  Commenting your code however, has traditionally been seen as a “good thing”.  So much so that most modern programming environments make some provision for document generation and offer style-guidelines for commenting your code.

    I believe, however, that excessive commenting is actually an anti-pattern and should be treated with caution, and as far as possible, avoided.  I’m going to illustrate my point with a (somewhat contrived, but in no way unusual or outlandish) example.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace XmlDocumentationAntiPattern
    {
        /// <summary>
        /// XmlCommentHell
        /// </summary>
        /// <example>var xmlCommentHell = new XmlCommentHell</example>

        public class XmlCommentHell
        {
            /// <summary>
            /// Description
            /// </summary>
            /// <remarks>String description of XmlCommentHell.</remarks>
            /// <example>instance.Description = "some description";</example>

            public string Description { get; set; }

            /// <summary>
            /// XmlCommentHell ctor
            /// </summary>
            /// <param name="description">Description string</param>

            public XmlCommentHell(string description)
            {
                // Assign description
                Description = description;
            }

            /// <summary>
            /// Gets the description property, but in reverse!
            /// </summary>
            /// <returns>Reversed description</returns>
            public string GetReversedDescription()
            {
                return new string(Description.Reverse().ToArray());
            }
        }
    }

    The context of this example is C# but similarly applies in other programming languages.

    If you find yourself engaging in XML comments like the above example, I really believe “you’re doing it wrong”.  Why?

    • Reduced code readability

      The Xml comment clutter in the code above actually reduces its usefulness.  The code now takes three times as much screen real estate to maintain and understand.

    • Low signal to noise ratio – too many characters that add no value

      When you’re maintaining systems, however large, the act of actually writing code takes time and adds maintenance overhead.  Past the visual unpleasantness, the amount of mark up required to document relatively simple methods actually reduces the code’s utility.  Only ever type ANYTHING in your IDE if it adds value.

    • Incredibly obvious comments

      There’s a wonderful quote to the tune of “always pretend that the person that’s going to maintain your code is an axe wielding maniac who knows where you live”.  I’d like to extend that by adding “so don’t insult their intelligence”.  There are few things I find as frustrating as reading comments that not only add no value but also make me feel like I’ve wasted a tiny piece of my life reading them.  Don’t waste your time or mine.

    • Violating the DRY principle

      The DRY principle is very well agreed upon in software development.  Don’t repeat yourself.  If your XML comments just repeat your method signatures, you’re not only repeating yourself (and thus wasting time) but you’re leaving yourself open to a maintenance nightmare as the code changes and the comments are left unchanged, leading to confusion and misinformation.

    • Possible misuse – the generation of utterly useless documentation

      When I see projects documented in this way I often joke that someone should run Sandcastle or Javadoc over the code to gain some tangible value in the form of MSDN-esque documentation.  I’m actually wrong.  This is a terrible idea.  Pop quiz time!  What’s worse than tedious documentation that adds no value?  Tonnes of hyperlinked documentation that adds no value!  Think of the poor developer wasting minutes to hours searching for the value in generated documentation before painfully realising that there’s none to be had.

    I occasionally get met with some resistance when I state my opinion on this for a couple of reasons.  People often argue that they do it for the sake of intellisense and IDE tooling, that they do it for generated documentation or just out of habit.  I feel like these are all dangerous reasons.

    Firstly, if you’re generating documentation for documentation’s sake, you’re either trying to please some form of middle management that doesn’t understand what documentation really exists for (hint: to help developers develop), or you’re wasting time.

    Secondly, if you’re documenting methods for intellisense you’re probably missing the point.  See, intellisense and other IDE self-help mechanisms are very good; they’ll display method signatures and method names with minimal fuss, any extra comments you glue on top of every method will either clutter your GUI, or become ignored white noise, potentially leading to that one important comment going ignored as your developers slowly desensitise themselves to reading the human generated “auto-doc” garbage.

    Finally, if you find that you need to cover your methods in comments because they legitimately don’t make too much sense at a glance then you’re falling foul of a different problem: you have code that’s not legible, maintainable and clear.  In this case you’d be far better served by ensuring your methods are named well, take logical parameters and have single clearly defined purposes.  In my experience, whenever somebody says that they need to comment their methods because the method is unclear, they really should be refactoring, not documenting.
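
    For comparison, here’s a sketch of how the contrived class from earlier might read with the noise removed.  This is one possible tidy-up rather than the only one; the single remaining comment is there because it says something the code itself can’t:

```csharp
using System;
using System.Linq;

public class XmlCommentHell
{
    public string Description { get; set; }

    public XmlCommentHell(string description)
    {
        Description = description;
    }

    // Reverses UTF-16 code units, so surrogate pairs and combining
    // characters will not survive the round trip.
    public string GetReversedDescription()
    {
        return new string(Description.Reverse().ToArray());
    }
}
```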

    Please DON’T bury your code in comments, just keep it all readable ok?

    Controlling Your Software Development Environment And Release Cycle In An Agile Way

    Sunday, March 22nd, 2009

    Fear Of Deployment

    When you start working on a new project there are four really important questions you should be able to answer about the deployment of your software:

    1. Are you unafraid of deploying your code to production?
    2. Do you know how your code will react in your production environment with production configuration?
    3. Does your code react in the same way in your production environment as it does in development?
    4. Are you confident in the repeatability of your installation procedure?

    Fear and uncertainty in the deployment of software projects is a common theme I’ve seen across all the companies where I’ve worked and it’s never good for your project.  The fear of deployment is always a good indicator of the confidence your staff have in the project they are working on.

    A project with unknown qualities and suspicious (or known but unreported) bugs often causes a serious fear of deployment.  There’s nothing like the fear of deploying an application and reducing a business critical system to its knees.

    I’ve worked on a handful of projects where I’ve taken it upon myself to rein in this deployment fear and chaos, and more often than not, the fear has its roots in an uncontrolled development environment.

    Configuring A Development Environment

    I have very few suggestions regarding which tools you should use to develop your software.  I think that you gain the maximum benefit by allowing developers to pick and choose the absolute best tools for the job, be that technically or by preference.  Allowing developers to use their tools of choice makes happy developers.

    By contrast however, the one thing I’m really certain about is that every development environment should start with an absolute base configuration that should not be changed.  This is the first way to solve the eternal “works on my machine” developer problem.  In essence, if all the development environments start off the same, there’s no excuse for a “works on my machine” error.  Not only will this reduce friction between team members, but it’ll speed up the development process because you’re all working to one known configuration.

    Furthermore, it isn’t only the development environments that should share this configuration; they should share the base configuration with your production servers.  A huge percentage of software release issues relate to differences between the production, test and development environments, and the only way to solve what amounts to human error is to make this discrepancy go away.

    In the current IT landscape where virtualisation is painfully simple (and often free) there are no excuses not to have a development environment (or at the very least development staging virtual machine) that accurately and completely mirrors your production server.  If you’re struggling with licenses of third party software, produce mocks / proxies / approximations.  I can assure you that the time you spend building these glorified testing rigs will work healthily towards eradicating development-to-production errors.

    If It’s Worth Doing Once, It’s Worth Doing Three Times

    Another nasty pattern I see frequently is the development-to-production pattern.  If you’re compiling software on your development machines and releasing it to production, “you’re doing it wrong”.  What’s worse than a “works-on-my-machine” conversation with a colleague?  You got it, a “works-on-my-machine” conversation with a system administrator.  It’s not an excuse.  Stop it.

    You need to start dog-fooding your deployments, and this means environment cloning and real test servers.  I always follow a simple pattern; developer machine -> developer test box -> system-test box -> user acceptance test box -> production.

    In order to do this you’ll need a series of environments (physical or virtual)

    • Developer machine
    • Developer environment
    • Test environment
    • UAT environment
    • Production environment

    That might sound like a lot of hardware but it really isn’t.  Use virtualisation and use it well.  Clone your virtual environments if need be.  Use snapshots and rollbacks to test deployments.  It shouldn’t take any time to administer (and I’ll explain why in the next section) and by the time your code is running on production you’ll have moved it through three different environments.  This is absolutely key to having faith in your deployment procedures.  When you’ve deployed your code so many times that you’re sure it works, in an environment that’s a mirror image of production, production is a far less scary place to be deploying to.

    The key to this technique is that the environments all have very clearly defined purposes.  The developer produces his code changes to a system on his development box.  The code seems to work so it’s deployed to the developer environment and checked into your source control repository.  The development environment should be treated as a place where things could potentially get messed up, but it’s also a good place to host joint resources (a database, some middleware) that all the developers make use of.  If the code works in the developer environment it can be promoted to the test environment for the system tester (or a developer wearing a system tester’s hat) to test.  Once passed on the test box, the code changes can be promoted to the UAT environment for users to preview coming code releases.  Once these changes have been accepted, the code is released into production.

    Beware of release insanity.  It makes perfect sense to frequently deploy to the developer environment but only occasionally promote code up to test and UAT.  This should fit around your way of working, just be sure to keep the conceptual integrity of the environments.  If you find a bug in test, don’t fix it in the test environment, fix it in development. It’s up to you to ensure you know what code is running across your environments to avoid any nasty surprises.

    One Install To Rule Them All

    At this point you’re probably thinking that this all sounds like a huge amount of effort but it isn’t.  The reason it isn’t is that you should, for the most part, be able to deploy your entire system in about 5 clicks / commands.  Don’t let your deployment mechanism be the weak link in your application.  There are tonnes of install technologies on the market, both free and costly, to match whichever platform and tool chain you’re working in.

    I wish I could remember where I read it, but one of the pieces of setup advice I read long ago that always stuck with me is that you should aim for a one-click install.  You might not achieve it, but you should aim to get very very close.

    You need to invest time in configuration management for your application.  Build tools to auto-generate configurations for your deployment environments, build tools to automate setup and deployment.  These tools are worthwhile, because from the moment they’re built, you never need to worry about deploying your software.  They remove risk, and anything that removes risk from software development should be embraced.
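
    As a very small sketch of the configuration side, assume a hypothetical `ConfigGenerator` that fills one shared template with a set of values per environment.  The `{{Name}}` placeholder syntax is an assumption made for the example, not a standard:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generator: one template, one set of values per environment.
public static class ConfigGenerator
{
    public static string Render(string template, IDictionary<string, string> values)
    {
        if (template == null) throw new ArgumentNullException(nameof(template));
        if (values == null) throw new ArgumentNullException(nameof(values));

        string result = template;
        foreach (var pair in values)
        {
            result = result.Replace("{{" + pair.Key + "}}", pair.Value);
        }

        // Fail loudly rather than deploy a half-filled configuration.
        if (result.Contains("{{"))
            throw new InvalidOperationException("Template has unresolved placeholders.");

        return result;
    }
}
```

    The same template then produces the test, UAT and production configurations, so the only thing that differs between environments is a short, reviewable list of values.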

    Once you’ve built these tools and installers you’ll find you’re able to do things that you only thought were possible in large companies with seemingly limitless resources.  You’ll be able to configure nightly builds using a continuous integration package of your choice and have the nightly builds install themselves in your developer environment.  You’ll be confident that at any point you can provide a working copy of your software on request.

    No Fear Of The Unexpected

    The wonderful benefit you get at the end of this process is confidence in your software.

    So you’ve released your system to production, do you then feel the need to slavishly test the system in the live environment?

    Imagine that urge going away, knowing that by the time the code hits production that any environmental glitches that could manifest have had the ability to fail three times already.  Any amount of cursory testing you could perform at this point should pale in comparison to the testing you did in the test environment and the user testing environment.  If your “at a glance” testing somehow discovers a bug, you’ve done something very very wrong in the lead up to release.  You should be confident in the fact that your code has been run exhaustively and often by the time it’s reached production.

    The world isn’t perfect however, and it’s obviously advisable to do a quick audit of the software once installed.  Just don’t force yourself to repeat your system tests in production; it’s an obvious anti-pattern and you’ll gain no benefit from doing so.

    If you can devise an automated tool to verify your deployments after install, you’ll be able to round off your sense of deployment-security.
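
    Such a tool doesn’t have to be elaborate.  Here’s a sketch, assuming a hypothetical `DeploymentVerifier` that runs a list of named checks and reports the ones that failed; what each check actually does (hit a health page, open a database connection, read a config value) is up to you:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical post-install verifier: runs named checks and
// returns the names of the ones that failed.
public class DeploymentVerifier
{
    private readonly List<KeyValuePair<string, Func<bool>>> _checks =
        new List<KeyValuePair<string, Func<bool>>>();

    public DeploymentVerifier AddCheck(string name, Func<bool> check)
    {
        if (name == null) throw new ArgumentNullException(nameof(name));
        if (check == null) throw new ArgumentNullException(nameof(check));
        _checks.Add(new KeyValuePair<string, Func<bool>>(name, check));
        return this;
    }

    public List<string> Run()
    {
        var failures = new List<string>();
        foreach (var check in _checks)
        {
            bool passed;
            try
            {
                passed = check.Value();
            }
            catch (Exception)
            {
                passed = false; // a check that throws counts as a failure
            }

            if (!passed) failures.Add(check.Key);
        }
        return failures;
    }
}
```

    An empty list means the install looks healthy; anything else is a reason to roll back before users find the problem for you.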

    A Zen-like State Of Satisfaction

    I opened this piece with a few simple questions.

    • Are you unafraid of deploying your code to production?
    • Do you know how your code will react in your production environment with production configuration?
    • Does your code react in the same way in your production environment as it does in development?
    • Are you confident in the repeatability of your installation procedure?

    I can answer yes to all four questions because I understand my environments and I hope you can too.

    Why I Love Stand Up Meetings And How To Make Them Work For You

    Friday, February 27th, 2009

    I hate meetings but I love stand up meetings- contradiction?  No.

    They’re a brilliant, informative, low impact way to keep communication fluid amongst your (small to medium sized) team, regardless of profession.

    I’m going to talk a little about the reasons I’m writing about this, before getting on to the anatomy of my perfect stand up meeting, then tell you what stand up meetings could do for you and your team.

    The Media Circus

    My partner works in magazine journalism as the chief sub-editor for a title with a reasonably large international circulation.  Consequently, every month without fail, press week is a nightmare.

    As is apparently the norm in the publishing industry (or what’s left of it) the eternal cycle of content creation is a very stressful process.  Freelancers are always late; things never hit the editor’s desk on time, people have to be continuously chased up, facts need to be checked, and all before a press date.  If this isn’t complete by the press date, the company starts haemorrhaging money by the hour as the presses are literally “stopped”.

    All of the above is exacerbated by the fact that writing is one of the industries that allows single workers to effectively “go dark”, drop off the radar and appear x-days later with finished content.  Not exactly communicative, ironically.

    You Were At Work How Long?

    She’s quite durable in regards to getting work done and quite effective at not letting the battle of press week become emotional or stressful, but I get the distinct impression it can be highly frustrating when you’re working until 11.30pm and getting the last tube home for a few days at the end of every month just to cover for the lack of structure to publishing a medium sized monthly title.

    This pattern culminated in a conversation with the good lady, with her expressing a little bit of exasperation on the topic and wondering if there was actually anything she could do to ever change this bad pattern of behaviour and improve the workflow of the entire team.  To me, there were a few simple tricks that could be employed to help facilitate change.

    The Programmer’s Perspective

    As a stark contrast I work in technology, specifically software development using what commonly comes under the banner of “Agile Methods” (there’s an on-going debate as to what counts as an agile method, but that’s beyond the scope of this piece, and frankly, my patience).

    I’ve been lucky enough to only suffer one professional role that insisted on following an old trusty waterfall model of software development- that is to say, a software project where you first sit down and collate a monolithic requirements document, then produce a complete system design, then implement it, then deliver it.  Because of this, I’ve been working “agile” since my second professional role, and the first thing I was introduced to on my first day was the concept of the “Stand Up” (meeting).

    I’d never come across one before, but they turned into one of the most useful parts of my day, and for something that lasts the best part of 10 minutes, that’s saying something.

    What’s A Stand Up?

    To quote the Wikipedia entry in its entirety:

    “A stand-up meeting (or simply stand-up) is a daily team meeting held to provide a status update to the team members. The ‘semi-real-time’ status allows participants to know about potential challenges as well as coordinate efforts to resolve difficult and/or time-consuming issues. It has particular value in agile software development processes, such as Scrum, but can be utilized in any development methodology.

    The meetings are usually time boxed to 5-15 minutes and are held standing up to remind people to keep the meeting short and to the point. Most people usually refer to this meeting as just the stand-up, although it is sometimes also referred to as the morning roll call or the daily scrum.

    The meeting is usually held at the same time and place every working day. All team members are expected to attend, but the meetings are not postponed if some of the team members are not present. One of the crucial features is that the meeting is intended to be a status update to other team members and not a status update to the management or other stakeholders. Team members take turns speaking, sometimes passing along a token to indicate the current person allowed to speak. Each member talks about his progress since the last stand-up, the anticipated work until the next stand-up and any impediments they foresee.

    Team members may sometimes ask for short clarifications but the stand-up does not usually consist of full fledged discussions.”

    You’ll notice that there are a hell of a lot of references to software development both in the prose and on the Wikipedia page, and it surprised me to discover that outside of technology, stand ups aren’t just uncommon; they essentially don’t exist.

    I could hardly believe it at first, but a cursory Google search for “stand up meeting” concluded that outside of software, stand ups either don’t exist or are so rare as to not appear on general business websites.

    For the sake of completeness, it’s worth noting that the internet is always skewed to technological topics in these kinds of searches, but when all the results are about programming I can at least see some anecdotal evidence to confirm my suspicions.

    My Favourite Stand Up Meeting Pattern, And What It Gets You

    First things first, I really don’t like calling “stand ups” “stand up meetings”.  I always feel like the word meeting is loaded with negative and tedious connotations- meetings are boring and shitty and people don’t like going to them, period.

    It’s important to note that because a stand up isn’t really a meeting, it doesn’t have the emotional baggage associated with one.  A stand up works best for teams of about 2-15 people (for the sake of brevity).

    The Ideal Stand Up

    A perfect stand up lasts between about 10 and 15 minutes (you’re “standing up” to try and enforce this) and takes place EVERY day.

    The gist is, somebody starts it off (it doesn’t matter who) and everybody states what they did yesterday, what they’re planning on doing today, and anything that they think is going to get in the way of their current task.

    Other team members are allowed and encouraged to comment and ask brief questions after the current speaker has finished.

    The meeting should take place roughly 20 minutes after the start of the working day, to allow people time to sit down, get a coffee, have a chat, read their email and work out what they’re doing.

    What You Get

    The idea is, through the use of this simple 1 to 2 minute exchange from each person, everyone has a much clearer understanding of what each team member does.  Just ask yourself, do you REALLY know what all the members of your team do?  This is especially pertinent if your job involves managing your colleagues in some way.

    The other participants should be encouraged to ask questions and comment after the speaker has finished but in-line with the comments they just made.  The idea behind this is that another team member may well have solved a problem that the speaker was suffering, but due to isolation, the pace of the workday or a lack of communication the fact that the problem has been solved may have been lost.

    It’s a great way to share knowledge and has a very low impact on time.  The one caveat is that any lengthy discussions should be followed up privately after the stand up, because if you stray into detail or go off on a conversational tangent you’re wasting collective group time.

    Psychological Effects

    Stand ups make your team work harder and more efficiently.

    That sounds absurd but stay with me.

    Because of the permanent rolling accountability of stating what you’ve been up to, stand ups are a good way of reducing the likelihood of your staff wasting time.  They’re nice and self regulating; nobody wants to stand in front of their team and say exactly the same thing, every day for a week.  It attracts the kind of attention that most people, lazy or not, prefer to avoid.

    Past that, people enjoy having something new to say every day.  It’s a little like a grown up, caring sharing version of show and tell for the business world.  You don’t want to be the kid that doesn’t participate.  Peer pressure?  Totally.  But it’ll make your team more effective.

    This side effect has two key uses.  Firstly, it positively encourages teams to be proactive, but secondly, it gives the managing member a very solid handle on team activities (or if relevant, the lack of).

    Will This Work For Me?

    Honestly?  I don’t know.  My experience is very slanted towards technology businesses so please remember your mileage may vary.

    I think stand ups are a vital and useful part of an agile team however you shape them, and I’d say it’ll certainly work for you in software development.

    Outside of technology, I genuinely don’t see how these clearly transferable practices wouldn’t apply and give you at the very least, team communication benefits.  Even if you think your team communicates well, ask yourself how much you know about both your colleagues’ roles, and what they’re doing today.

    I found a good article on the anti-patterns of stand up meetings; the stuff to avoid.  I’d recommend having a quick read of it, in order to have all the information.

    If you think the answer could be “not enough” then I’d really recommend you give it a go.  I have no idea if the good lady is going to suggest her team experiment with stand ups to help smooth out the pressures of their press week, but I think it would be an awesome idea for everyone.

    I love stand up meetings and you should too!

    Embrace, Extend, Extinguish: Integration with Uncooperative Systems

    Saturday, July 12th, 2008

    Cornered By Technology

    It’s not uncommon in enterprise software development to be tasked with integrating with a third party platform that just won’t play nicely however hard you try. These scenarios often creep up on you at the most unexpected time, be it as a requirement, or a partnership with an uncooperative or unskilled vendor or technical partner. Even worse, it can often become clear that the relationship between your platform and a third party platform is effectively untenable after you’ve invested significant development effort to get “most of the way” there.

    The Need for Application and Data Separation

    The really bad news is that this isn’t an uncommon scenario, and there are countless businesses operating whilst being held to ransom by systems that they paid for. Unfortunately, these systems cannot be abandoned, normally because the system holds some crucial operational data that the company would cease to function without. Antiquated CRM solutions, Billing platforms and databases are a few of the most common places these issues arise. The fact that you’ve been effectively backed into a corner by a technology partner isn’t the root of the problem however.

    These scenarios normally spawn from a historic or bad choice in application design either on your own part or on the part of the offending partner as a side effect of intertwining corporate data and applications into a tightly coupled relationship. You shouldn’t offer the keys to your business to any single piece of software, especially a piece of software operated by a third party, and it’s of the highest importance that you have a clear exit strategy if you choose to enter such an abusive relationship.

    Anyone who works with real software on a day to day basis knows well that a full separation of software and the data on which it operates isn’t entirely practical. In reality, you are going to end up with applications controlling certain portions of your data in proprietary formats, but with due diligence and a mind for separation of concerns when choosing available solutions, a business-crippling crisis should be easy to avoid.

    When choosing platforms and service partners, don’t skim over the details of data storage. Try to ensure that any application you’re trusting with your business data stores it in a way that lets you migrate that data elsewhere if you so desire.

    All People Perform To the Best of Their Abilities When Available Time and Constraints Are Accounted For

    It’s very easy to become adversarial when faced with a business crippling technology partner or application and it’s of the utmost importance that you keep your ego and temper in check when dealing with an especially uncooperative partner. This sounds like an obvious piece of advice when dealing with people but it’s just as appropriate a guideline when dealing with software.

    Don’t do or say anything you’ll regret, don’t antagonise your opposition (and if you’re unlucky, an uncooperative service partner will certainly become an opponent), and certainly don’t make any knee-jerk decisions regarding your own codebase. It’s important to remember that whilst the system you’re attempting to integrate with may well appear intentionally antagonising or impossible, it’s likely that the functionality of the system was not achievable in any other way, and you genuinely are working with the best product that is available. Concentrate on productive measures, not the conflict and stress of a business threatening problem.

    It’s Not You, It’s Me

    It should be quite evident that there’s only one solution to this particular problem; technical divorce.
    The goals of a project to end an abusive software relationship are reasonably clear cut and simple:

    1. Don’t negatively impact business operation without good reason.
    2. Achieve data and application logic separation.
    3. Escape with minimal impact on other integrated systems.
    4. Take preventative measures; don’t get hurt again.

    Your number one goal is to not disrupt day to day business during this period of technical therapy. You need to continue working with your technology partner on a daily basis in order to operate and it’s important that you don’t anger your partners or hinder the operation of the soon to be replaced system with disruptive development. Following good software development practices in general should ensure you avoid technical issues of these kinds.

    Your second goal is to separate your mission critical data from the dying system. Data ownership is a huge task and not something that can be summed up in a few paragraphs. You may need business analysts to investigate who owns the data in your business and where it should reside. As a good guideline, it’s worth considering that as a business you should own all of your business data with specific applications fed on the data they require to operate. Results should be imported back into a master database or data-store. It’s not a silver bullet or a solution for everyone, but if you approach the problem of data ownership from this perspective then you’re at least thinking along the right lines, where your applications are secondary to your data needs. An order tracking system should only be concerned with order items and names; a billing system should only be concerned with financial data and billing. Try to ensure you don’t get lost in a spaghetti mess of duplicated data.

    Your third and most difficult goal is to make this process transparent to the end user by reducing the impact of the change on your other systems. I’ll spoil the fun now by telling you that this is probably impossible and that the user will see a difference. The key is ensuring that difference is only ever iterative improvement to existing business processes and never prohibitive to productive work.

    The fourth goal is preventative and hard to define. Seeing as you’re going through this painful technical divorce, ensure that the amount of thought and work that goes into this project is sufficient to prevent this scenario reoccurring. Employ good business analysts to help you design the project, investigate data ownership and data warehousing, and invest in technologies that adhere to open standards and allow for data portability. Build your replacement system in such a way that when you want to replace it with the next wave of technology, your own developers won’t face the challenges you’re facing now.

    Designing a Solution

    In order to escape from this heavily relied upon piece of software we’d do well to learn a few tricks from the people who have participated in Software Modernisation over the years.

    The Wikipedia entry for software modernisation (http://en.wikipedia.org/wiki/Software_modernization) reads:

    “Software Modernization is the process of understanding and evolving existing software assets.

    There is a vast amount of highly functional, operational software, representing enormous commercial value deployed in organizations around the globe. To be precise, existing systems are defined as any production-enabled software, regardless of the platform it runs on, language it’s written in, or length of time it has been in production.

    These entrenched software systems often resist evolution because their strategic value and ability to adapt has diminished through factors not exclusively related to its functionality. Common examples of such factors are a system’s inability to be understood or maintained cost-effectively, inability to interoperate or dependence on undesired technologies or architectures”

    The goals of software modernisation are functionally similar to the goals of replacing an uncooperative system, albeit with a different motivation for the system replacement. This “modernisation” is happening due to the need to maintain defective but business critical operations in the short term rather than retaining currently good functionality in the long term.

    Software modernisation is a whole topic in and of itself, but at its core there are two common methods of modernising software; black box modernisation and white box modernisation. These terms retain their standard computer science meanings.

    Both modernisation practices require the encapsulation of the legacy system in modern code, creating a fresh set of APIs in a modern language, which in turn calls an API or function of the legacy system.

    Black box modernisation involves calling only the publicly facing APIs of the legacy system, whereas white box modernisation involves a degree of knowledge of the underlying platform and its operations. It’s exceptionally common for extensive reverse engineering of a platform to take place in white box modernisation, whilst black box modernisation is more or less a thin wrapping layer.

    We’re going to take the idea of system encapsulation from software modernisation, treat our uncooperative system as though it were a legacy system, and develop an API to encapsulate the system’s functionality.
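
    As a rough illustration of the black box approach, here is a minimal Python sketch. Everything in it is hypothetical: LegacyBilling, its single EXEC entry point, and the pipe-delimited record format are invented stand-ins for whatever your uncooperative system really exposes. The wrapper touches only that public call and hands back a typed result.

```python
from dataclasses import dataclass


class LegacyBilling:
    """Hypothetical stand-in for the uncooperative system; a real one will differ."""

    def __init__(self):
        # Pipe-delimited records keyed by a zero-padded account number.
        self._records = {"00042": "ACME LTD|1250|GBP"}

    def EXEC(self, command: str) -> str:
        # The only public entry point: a command string in, a raw string out.
        verb, _, arg = command.partition(" ")
        if verb == "GETACC" and arg in self._records:
            return "OK|" + self._records[arg]
        return "ERR|NOTFOUND"


@dataclass(frozen=True)
class Account:
    name: str
    balance_pence: int
    currency: str


class BillingWrapper:
    """Black box wrapper: relies only on the legacy system's public call."""

    def __init__(self, legacy: LegacyBilling):
        self._legacy = legacy

    def get_account(self, account_id: int) -> Account:
        raw = self._legacy.EXEC(f"GETACC {account_id:05d}")
        status, *fields = raw.split("|")
        if status != "OK":
            raise LookupError(f"account {account_id} not found")
        name, balance, currency = fields
        return Account(name=name, balance_pence=int(balance), currency=currency)


wrapper = BillingWrapper(LegacyBilling())
account = wrapper.get_account(42)
print(account)
```

    Callers of get_account never see the command strings or the raw record format, so those details stay in one place.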

    Function Not Form

    During the “aggressive replacement” of the uncooperative system, modern development methodologies will get you a long way. You should keep a keen focus on loose coupling of the interface you create to describe the business needs and the underlying wrapping code of the system beneath it.

    When designing your replacement API, you should endeavour to describe the business needs rather than mapping new API calls to the uncooperative system in a one to one fashion. You should take great care to describe function rather than form, decoupling the core business needs from the implementation underneath.
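
    To make “function rather than form” a little more concrete, here is a small Python sketch under invented assumptions: the OrderService interface, the LegacyOrderSystem class and its two low-level calls are all hypothetical. The point is only that the API names a business operation (cancel an order) while the wrapping code decides which legacy calls that takes.

```python
from abc import ABC, abstractmethod


class OrderService(ABC):
    """Describes what the business needs, not how any one system does it."""

    @abstractmethod
    def cancel_order(self, order_id: str) -> None: ...

    @abstractmethod
    def is_cancelled(self, order_id: str) -> bool: ...


class LegacyOrderSystem:
    """Hypothetical stand-in for the uncooperative system's low-level calls."""

    def __init__(self):
        self.status = {"A1": "OPEN"}
        self.stock_holds = {"A1"}

    def set_status_code(self, order_id, code):
        self.status[order_id] = code

    def release_stock_hold(self, order_id):
        self.stock_holds.discard(order_id)


class LegacyOrderService(OrderService):
    """One business operation may need several legacy calls underneath."""

    def __init__(self, legacy):
        self._legacy = legacy

    def cancel_order(self, order_id):
        self._legacy.release_stock_hold(order_id)
        self._legacy.set_status_code(order_id, "CANC")

    def is_cancelled(self, order_id):
        return self._legacy.status.get(order_id) == "CANC"


legacy = LegacyOrderSystem()
orders: OrderService = LegacyOrderService(legacy)
orders.cancel_order("A1")
print(orders.is_cancelled("A1"))
```

    A one to one mapping would instead have exposed set_status_code and release_stock_hold directly, and every caller would then need to know the legacy system’s status codes.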

    Good API design isn’t something you wake up with the innate ability to do, and thoroughly describing the business requirements of the uncooperative system in your new API will likely take several iterations. This isn’t a problem however, as it’s recommended that you iterate through a process of this scope rather than attempt some big bang replacement of the uncooperative system, to avoid great risk and inconvenience to the business.

    Your API should creep into existence, method by method. Existing applications should be updated to use the new API methods as they’re developed to allow for an iterative migration to the new wrapping code.

    Embrace, extend, extinguish

    Once you’ve designed the API you wish to use to replace your uncooperative system, you actually have to produce code that maps between the two.

    This is no small task and will probably be the majority of the work required by the project. This code is very implementation specific. You may be able to just codify your new APIs into a few API calls of the uncooperative system itself; conversely, you may have to deconstruct the behaviour of the system and replicate it in entirely fresh code. Either way, it’s exceptionally important that at this stage you keep your implementation separate from your API and service layer.

    Thus far we’ve embraced and accepted our uncooperative system, we’ve extended a hand towards it in the form of a neatly designed API, and the final step of your project will revolve around replacing the system entirely.

    Once your API is fully featured, and all the dependent systems in your business are referencing it for any communication they need to perform, you can begin seeking a replacement system.

    Choose wisely. At this point you’ll likely be more than aware of the pains involved in another costly migration! Take your time, there’s no hurry, and developing your own solution is even an option.

    Once you’ve chosen the replacement, you need to write some code again, this time to translate calls from your new API across to your replacement system. Because you did remember to keep your API decoupled from the implementation, right?
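
    Here is a minimal Python sketch of why that decoupling pays off at this step. All of the names are hypothetical (CustomerDirectory, the two adapters, send_invoice), and the two “systems” are just in-memory data. The calling code depends only on the API, so swapping the adapter behind it leaves the caller untouched.

```python
from typing import Protocol


class CustomerDirectory(Protocol):
    """The API every dependent system calls; it names no vendor."""

    def email_for(self, customer_id: str) -> str: ...


class OldCrmAdapter:
    """Translates API calls to the (hypothetical) uncooperative system."""

    def __init__(self):
        self._rows = ["C7;Ada;ada@example.com"]

    def email_for(self, customer_id: str) -> str:
        for row in self._rows:
            cid, _name, email = row.split(";")
            if cid == customer_id:
                return email
        raise KeyError(customer_id)


class NewCrmAdapter:
    """Translates the same API calls to the (hypothetical) replacement."""

    def __init__(self):
        self._customers = {"C7": {"name": "Ada", "email": "ada@example.com"}}

    def email_for(self, customer_id: str) -> str:
        return self._customers[customer_id]["email"]


def send_invoice(directory: CustomerDirectory, customer_id: str) -> str:
    # Calling code depends only on the API, so it survives the swap.
    return f"invoice sent to {directory.email_for(customer_id)}"


before = send_invoice(OldCrmAdapter(), "C7")
after = send_invoice(NewCrmAdapter(), "C7")
print(before == after)
```

    In a real migration the second adapter is the only new code the dependent systems ever notice, and ideally they don’t notice it at all.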

    At this point you’ve succeeded in using an iterative approach to software design to fully replace a system that may have previously seemed inseparable from the business. The ROI of a project of this nature is hard to quantify, but I thoroughly believe that if you feel that you need to embark on a project of this nature, its completion will surely strengthen one of the most brittle parts of your operational business.

    Congratulations! You made it out alive.

    Footnote: Avoid the same thing happening to you

    This is honestly a plea to anyone that’s ever going to write a system they expect other developers to communicate with. Practice designing APIs. Write them well. Document them well. Keep them up to date. Support them. Define concise operations. Be good. If everyone is “good” then hopefully fewer people will have to suffer difficult integration exercises.

    The Vote Of No Confidence In The Entity Framework

    Wednesday, June 25th, 2008

    It appears as though the Microsoft MVPs that were called upon to advise on the technicalities of the forthcoming Entity Framework hit a little bit of a roadblock.

    When I say “a little bit”, it seems as though Microsoft just point blank disregarded their warnings and recommendations in regard to creating OR mappers. It’s probably the first time I’ve seen this kind of scenario end with the technical advisors posting a warning and a general vote of no confidence.

    I’d be lying if I didn’t say I felt a little disappointed that, after making steps in the right direction and asking domain specialists for advice, Microsoft entirely disregarded the advice of the specialists they consulted, resulting in what seems like an unusable shipping product.

    I’ve not attempted to use the Entity Framework in any form, but the technical criticisms in the vote of no confidence are quite explicit, and as a developer who makes extensive use of OR/M, if those criticisms are accurate (which is very likely) I’d certainly treat the Entity Framework with the same unfortunate disregard.

    At least NHibernate isn’t broken!

    Development Tricks: Debug View

    Friday, February 22nd, 2008

    I’m often surprised that people haven’t come across this wonderful ex-Sysinternals (now Microsoft) tool.

    Debug View

    DebugView allows you to see the debug output written by any application currently running on the machine.  To .NET developers, those would be the messages you can write out with System.Diagnostics.Debug.WriteLine(), which are only compiled into debug builds.

    Why would you want to do that?  Well… if you grace your application with a healthy dose of debug messages, you can monitor the performance of applications whilst they’re deployed or in a production environment without having to break out the debugger.  Often a nice quick way to see what a system service is up to, or why your web application appears to be misbehaving.

    I currently make use of a combination of this and some IoC magic on a daily basis to monitor in-development services as part of a distributed system (hint: create a wrapper class for writing output messages, then implement it in various ways, Console.WriteLine, Debug.WriteLine, etc, even a null implementation to save performance in production environments…) and genuinely don’t know what I’d do without it.
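
    The hint above translates to most languages. Here is a rough Python sketch of the idea rather than the .NET code itself; MessageWriter, the three implementations and QueueWorker are names made up for the example, and MemoryWriter simply stands in for a debug output channel. Whichever writer gets injected decides where the messages go, and the service code never changes.

```python
from typing import List, Protocol


class MessageWriter(Protocol):
    """The wrapper: application code only ever sees this interface."""

    def write(self, message: str) -> None: ...


class ConsoleWriter:
    """Writes messages to standard output."""

    def write(self, message: str) -> None:
        print(message)


class MemoryWriter:
    """Collects messages in a list; a stand-in for a debug output channel."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def write(self, message: str) -> None:
        self.messages.append(message)


class NullWriter:
    """Discards every message; suitable where output is unwanted."""

    def write(self, message: str) -> None:
        pass


class QueueWorker:
    """Hypothetical service that reports progress through its writer."""

    def __init__(self, writer: MessageWriter) -> None:
        self._writer = writer

    def process(self, job: str) -> str:
        self._writer.write(f"processing {job}")
        result = job.upper()
        self._writer.write(f"finished {job}")
        return result


memory = MemoryWriter()
QueueWorker(memory).process("resize-image")
quiet_result = QueueWorker(NullWriter()).process("resize-image")
print(memory.messages)
```

    An IoC container would normally pick the writer from configuration; here it’s passed in by hand to keep the sketch short.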

    And hell, if you’re really bored it’ll let you know which Windows components are running with debug symbols compiled in (ActiveSync, I’m looking at you…).

    Now Playing: Zimmers Hole – Flight Of The Knight Bat