With the popularity of service oriented architectures and other buzz phrases related to software as a service, good API design has become a significant selling point for any software platform in the past 5-10 years. People make purchasing decisions based on how easy it is to interoperate with your applications and code, and as such the number of client and public facing APIs attached to software has skyrocketed. I’d like to believe the days of dropping strategic text files in directories to trigger some action or another in an application are behind us.
In this article I’m going to:
- Explain why you should choose your method names carefully, and what to call them
- Explain why pretending to be a data access layer is a terrible thing for an API to do
- Talk about the dangers of leaky abstractions in an API
- Explain the benefits of creating a data contract between you and the calling code
- Explain why it’s vital to support standards
- Remind you to make sure that your users can retrieve values they’re going to want to modify
- Suggest supplying compiled libraries alongside your API documentation
- Explain why it’s important to keep your API implementation clean
- Talk about the benefits of dogfooding your API
- Make the case for supporting atomic operations, including rollbacks on failure
- Discuss bulk operations
- Try and convince you that both logging and security should be first class citizens
- Beg you to maintain integration tests and, most of all, to keep it simple!
Your API Sucks
You’ve probably used an API and there’s a good chance you’ve had to write one. This probably won’t surprise you: most APIs suck. They’re horrible to use and built around illogical leaky abstractions that leave you flicking through huge wads of documentation just to make the most rudimentary feature work.
About 18 months ago, after a year of struggling with a broken third party API that almost brought a business to its knees by placing significant roadblocks in front of in-house development, I was part of a team tasked with designing our own client facing API. With no desire to expose other developers to the cruel and unusual punishments of software design we’d had to endure, we came to the conclusion that it was really important that we got this piece of the system right. People say first impressions are everything, and your API design can make or break the faith other developers have in your ability to produce software. Show somebody a shitty API and they’ll perhaps correctly assume the rest of your code sucks too.
The Best Man For The Job
There’s a bit of a trend that I’ve noticed with some of the worst APIs I’ve worked with: they seem to be designed by the wrong people. The wrong people to design an API are 1) the guy that wrote the internal code to do the job the API is providing access to and 2) the consumer of the API.
The guy that wrote the code that the API is calling under the hood will be inherently slanted to implement an API which exposes this functionality and will have a predisposition to creating a leaky abstraction. This is especially bad for the consumer: APIs designed by the internal implementer tend to assume the consumer knows far more than he really does, or has access to internal data that, in reality, he doesn’t.
Conversely, an API designed by the consumer of the API will have a tendency towards solving problems that are not the concern of the API itself. The consumer will, either accidentally or by intention, attempt to offload some of the work that should be the responsibility of the calling code into the API.
Ideally, the person that’s writing the API will have knowledge of the system internals, but not be the guy that wrote them. A fellow team member with passive experience of the code would be a good choice, or ideally, a pair design exercise between the person that originally wrote the code and an API designer, with the consumer as a consultant.
Speaking The Same Language
Like a lot of software development, you make good progress when you get your terminology right and understand exactly what you’re trying to produce. I’ve consistently found that the best way to think of a client facing API is as a thin orchestrating wrapper that summarises, in code, a set of business processes that you wish to expose to the public.
In order to get your API design right, you need to clearly define and agree on the boundaries of the system with both your internal team and your consumers. It’s important that you have a clear understanding of the following:
- The responsibility of the calling code
- The responsibility of the API layer
- The responsibility of the internal code the API makes calls to
This might sound like a really simple suggestion, but I’ve taken part in countless discussions where people on both sides of the API just “presumed” that either the calling code or the API would perform specific functions (data cleansing, logging etc) when in fact this confusion had led to none of the implementers bothering to write the required functionality. Make sure you know for certain what your API is responsible for doing.
Defining Your API – Tips and Tricks
Defining your API methods (or the “contract” of the API) is the most important thing to get right and there are several vital things to remember.
- Choose your names wisely, using the language of the business
It’s vital that your API methods speak in terms that the caller is going to understand. Your API should be readable. If your users go hunting for the documentation every time they want to use a method, then you’re probably doing it wrong.
Clarity in naming is exceptionally important. The names of your API methods should succinctly state what action that method call is going to perform. Don’t fear long method names; embrace them for clarity. As a general rule, your methods should probably always be in the form DoSomething(object withThis);
Ensure that when naming methods you reflect business operations in the method names, not the underlying implementation.
Bad example: void InsertToTblCustomer(string custDataValues);
Good example: void AddCustomer(Customer customer);
Good example: void DisableAccount(string accountId);
- Don’t pretend to be a data access layer
APIs should summarise business operations in a logical and meaningful fashion. You are not a public facing data access layer and you should never pretend to be. If your users want raw database access, give them read-only permissions on some tables and a copy of SQL Server Management Studio. So don’t write methods for CRUD operations in your API (unless you’re writing some kind of online file management utility).
Bad example: void InsertToTblCustomer(string custDataValues);
Bad example: void UpdateTblCustomer(string custDataValues);
There are no good examples!
- Avoid leaky abstractions
This is a fundamental and simple rule – don’t expose your callers to anything that they’re not interested in or won’t understand. If it’s not important, don’t show it. Don’t code for things nobody will ever need and don’t require your callers to have intimate knowledge of data types or internal categories in your system.
- Create a data contract between you and the calling code
I’m going to borrow some of the terminology from WCF here because I’ve found it an appropriate label. Create a Data Contract library for use in your API. This library should summarise the business process and the outward facing view of your software. It might contain terminology that doesn’t actually exist in the software itself, but in the business processes that the software models. Either way, this, and only this, should be the language in which the API talks to your callers.
Where possible, create this data contract in a separate assembly that’s entirely decoupled from your core system and distribute it to people that want to use your API. This is especially beneficial when using WCF as your clients can generate a service proxy and deal in the same data types that you are in your API code.
It should be the responsibility of the API layer to marshal the data from your data contract into the domain model of your internal components.
Your data contract should contain every type used to communicate with your API and the object model should be named in a way which is meaningful for the consumers.
Because your data contract is NOT the object model of your internal components, you’re able to add properties and objects that don’t logically exist in your internal components. This means that you can perform an operation using some internal component, gather the output in your API layer and then compose the output data in a meaningful way using classes written specifically for the data contract. This way, by the time the user has access to the output data, it’s in a format and language which they understand.
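To make that concrete, here’s a minimal sketch of the marshalling step. Every name in it (CustomerRecord, Customer, CustomerMarshaller, the status code values) is invented for illustration and isn’t taken from any real system; the point is only that the contract type can carry caller-friendly properties, such as FullName and IsActive, that don’t exist in that shape internally.

```csharp
using System;

// Hypothetical internal type: shaped for storage, not for callers.
public class CustomerRecord
{
    public int Id { get; set; }
    public string Forename { get; set; } = "";
    public string Surname { get; set; } = "";
    public int StatusCode { get; set; }   // assumed convention: 0 = active
}

// Hypothetical data contract type: shaped for the consumer.
public class Customer
{
    public string CustomerId { get; set; } = "";
    public string FullName { get; set; } = "";   // does not exist internally
    public bool IsActive { get; set; }            // hides the raw status code
}

// The API layer owns the translation between the two models.
public static class CustomerMarshaller
{
    public static Customer ToContract(CustomerRecord source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        return new Customer
        {
            CustomerId = source.Id.ToString(),
            FullName = source.Forename + " " + source.Surname,
            IsActive = source.StatusCode == 0
        };
    }
}
```

The caller never sees StatusCode or the split name fields, so the internal model can change without breaking the contract.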
- Support standards! Don’t reinvent the wheel!
Here’s a true story; while working with an API, my team was faced with the following API method:
object Run(string request);
It was the only method on the API, and “covered up” for around 30 methods all made available through one giant black hole in the side of the system. Underneath that, the request had to follow a specific XML format in order to call the appropriate method.
If you’re writing an API, stick to some kind of standards. Ideally, expose a web service endpoint with an accurate WSDL that people can call or use a simple and obvious REST endpoint.
Please, please, please, do not make other developers suffer by rolling your own delivery mechanism. We have enough of them; don’t confuse people by adding some more. If you’re going to use a raw sockets connection, supply a calling library and stick to some standard middleware like WCF rather than rolling your own.
Thousands of people have spent thousands of man years writing code based on existing techniques; you’re not better than all of them combined.
Reinventing the wheel is never good for anyone.
- Ensure that the user can retrieve values they’re going to want to modify
The number of times I’ve used APIs that let me set or update objects without returning the current state of the object is mind blowing. So always remember: If you’re going to let them set it, let them get it. Providing an update or “upsert” method without allowing your consumers to query the current state of data in the system is a complete waste of time.
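A rough sketch of what “let them set it, let them get it” might look like follows. The DeliveryPreferences type, the IAccountApi interface and the in-memory implementation are all hypothetical, made up purely to show the pairing of a getter with every update method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data contract type, invented for this sketch.
public class DeliveryPreferences
{
    public string AccountId { get; set; } = "";
    public bool LeaveWithNeighbour { get; set; }
    public string Notes { get; set; } = "";
}

// Every value the caller can set has a matching way to read it back.
public interface IAccountApi
{
    DeliveryPreferences GetDeliveryPreferences(string accountId);
    void UpdateDeliveryPreferences(DeliveryPreferences preferences);
}

// In-memory stand-in so the sketch runs without a real back end.
public class InMemoryAccountApi : IAccountApi
{
    private readonly Dictionary<string, DeliveryPreferences> _store =
        new Dictionary<string, DeliveryPreferences>();

    public DeliveryPreferences GetDeliveryPreferences(string accountId)
    {
        if (!_store.TryGetValue(accountId, out var stored))
            throw new KeyNotFoundException("Unknown account: " + accountId);
        return Copy(stored);
    }

    public void UpdateDeliveryPreferences(DeliveryPreferences preferences)
    {
        if (preferences == null) throw new ArgumentNullException(nameof(preferences));
        _store[preferences.AccountId] = Copy(preferences);
    }

    // Copy so callers cannot change stored state without going through the API.
    private static DeliveryPreferences Copy(DeliveryPreferences source)
    {
        return new DeliveryPreferences
        {
            AccountId = source.AccountId,
            LeaveWithNeighbour = source.LeaveWithNeighbour,
            Notes = source.Notes
        };
    }
}
```

With the getter in place, a caller can fetch the current preferences, change only Notes, and send the object back without clobbering the fields they didn’t mean to touch.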
- Supply compiled libraries to work with your API documentation
This may not always be possible, but it’s a really good idea to supply a sample implementation and compiled binaries with your API that cover the most common scenarios of usage. Not only does this prevent the consumer from struggling to get to grips with your API, but it allows you to outline and illustrate a set of best practices for usage. In an ideal world, the user could just use your sample code in their application, so ensure you license the code appropriately.
This is an exceptionally good way to deal with any authentication your API may require as you have the ability to provide additional helper classes to perform some common tasks (authenticate –> perform action –> logout, for example).
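As an illustration of that authenticate, perform action, logout pattern, here’s a sketch of the kind of helper a calling library could ship. IOrderApi, ApiSession and FakeOrderApi are names I’ve made up for the example, and the token-based login is an assumption about how such an API might work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical raw API surface: every call needs a session token.
public interface IOrderApi
{
    string Login(string user, string password);
    void Logout(string sessionToken);
    string PlaceOrder(string sessionToken, string productCode, int quantity);
}

// Helper shipped with the client library: logs in on construction and
// logs out on Dispose, so callers cannot forget either step.
public sealed class ApiSession : IDisposable
{
    private readonly IOrderApi _api;
    private readonly string _token;
    private bool _loggedOut;

    public ApiSession(IOrderApi api, string user, string password)
    {
        _api = api ?? throw new ArgumentNullException(nameof(api));
        _token = _api.Login(user, password);
    }

    public string PlaceOrder(string productCode, int quantity)
    {
        return _api.PlaceOrder(_token, productCode, quantity);
    }

    public void Dispose()
    {
        if (_loggedOut) return;   // make repeated Dispose calls harmless
        _loggedOut = true;
        _api.Logout(_token);
    }
}

// Stand-in implementation that records calls, so the sketch can run.
public class FakeOrderApi : IOrderApi
{
    public List<string> Calls { get; } = new List<string>();

    public string Login(string user, string password) { Calls.Add("Login"); return "token-1"; }
    public void Logout(string sessionToken) { Calls.Add("Logout"); }
    public string PlaceOrder(string sessionToken, string productCode, int quantity)
    {
        Calls.Add("PlaceOrder");
        return "ORDER-1";
    }
}
```

Wrapping an ApiSession in a using block means the logout happens even if PlaceOrder throws, which is exactly the sort of best practice a sample library can bake in.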
- Keep your implementation clean
Delegate the API logic to your middleware components / reusable libraries. Do your best to ensure the API layer doesn’t actually contain the logic required to perform operations, just the logic required to marshal the data from the API format into your internal data structures. The API should simply orchestrate calls to one or more internal methods because your API should simply be exposing existing functionality.
If the API is exposing some new, API specific functionality, consider splitting this behaviour into a separate assembly or binary to aid testability.
- Consider dogfooding
Dogfooding is the act of using the software you’re creating. I worked on a project where we were developing an order placement system in ASP.NET MVC, and as part of the design process we decided that we wanted to have an API that was a first class citizen. It then dawned on us that in order to produce the API in a way that accurately mirrored the functionality of the website, we should have the website consume the API like any other client would. The website had its own concept of user authentication, and when a user logged in, the web application logged in to the API as the current user.
Doing this not only ensured that our security model was watertight, but that any additional web functionality would immediately be available to API users because they were actually the same thing. On top of that, you gain confidence in your own API because you know that it’s called often by your own code, reducing the likelihood of users discovering bugs in your API because it’s not a product you actually use.
- Support atomic operations including rollbacks on failure
When implementing your API methods, ensure that if an exception occurs or an operation doesn’t complete, all the changes made by your API call are reversed. Consider explicitly supporting transaction scopes in your API to let your consumers compose their own “set” of operations.
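In a real system you’d most likely lean on your database’s transactions (or something like System.Transactions.TransactionScope) to get this behaviour. Purely to illustrate the idea, here’s an in-memory sketch that records an undo action for each completed step and replays them in reverse when a later step fails. CompensatingOperation and OrderApi are hypothetical names invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers how to undo each step that succeeded.
public sealed class CompensatingOperation
{
    private readonly Stack<Action> _undo = new Stack<Action>();

    public void Step(Action apply, Action undo)
    {
        apply();            // if this throws, nothing is recorded for it
        _undo.Push(undo);
    }

    public void Rollback()
    {
        // Undo in reverse order of application.
        while (_undo.Count > 0) _undo.Pop()();
    }
}

public class OrderApi
{
    private readonly Dictionary<string, int> _stock;
    private readonly List<string> _orders = new List<string>();

    public OrderApi(Dictionary<string, int> stock) { _stock = stock; }

    public IReadOnlyList<string> Orders => _orders;

    // Either every change below sticks, or none of them do.
    public void PlaceOrder(string productCode, int quantity, Action chargeCustomer)
    {
        var operation = new CompensatingOperation();
        try
        {
            operation.Step(() => _stock[productCode] -= quantity,
                           () => _stock[productCode] += quantity);
            operation.Step(() => _orders.Add(productCode),
                           () => _orders.Remove(productCode));
            chargeCustomer();   // may throw, e.g. payment declined
        }
        catch
        {
            operation.Rollback();
            throw;              // the caller still learns about the failure
        }
    }
}
```

If chargeCustomer throws, the stock level and order list end up exactly as they were before the call, and the original exception still reaches the caller.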
- Support bulk operations where appropriate
Building support for bulk operations into your API can often prevent performance issues occurring later when a user tries to, for example, insert 10,000 customers sequentially. Consider pluralising your methods, so instead of providing an AddCustomer(Customer customer) method, provide only an AddCustomers(List<Customer> customers) method. Doing this prevents callers from overloading your system by bulking data through your API in unintended ways, allowing you to properly cache required data and cater for these bulk operations.
This isn’t always appropriate; however, I’d always strongly suggest offering pluralised versions of methods that you suspect may be used in bulk, in order to help optimise your API calls and reduce the amount of data being transferred over the wire.
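Here’s a small sketch of why the pluralised method gives you room to optimise. The CustomerStore class, its round trip counter and the batch size of 500 are all assumptions made up for the example; the idea is simply that one AddCustomers call lets the API decide how to chunk the work.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; } = "";
}

// Hypothetical internal store that counts how often it is hit.
public class CustomerStore
{
    public int RoundTrips { get; private set; }
    public int Count { get; private set; }

    public void InsertBatch(IReadOnlyCollection<Customer> batch)
    {
        RoundTrips++;
        Count += batch.Count;
    }
}

public class CustomerApi
{
    private const int MaxBatchSize = 500;   // assumed limit, tune for your system
    private readonly CustomerStore _store;

    public CustomerApi(CustomerStore store) { _store = store; }

    // One call from the consumer; the API chooses the batching strategy.
    public void AddCustomers(List<Customer> customers)
    {
        if (customers == null) throw new ArgumentNullException(nameof(customers));

        for (int offset = 0; offset < customers.Count; offset += MaxBatchSize)
        {
            int size = Math.Min(MaxBatchSize, customers.Count - offset);
            _store.InsertBatch(customers.GetRange(offset, size));
        }
    }
}
```

Under these assumptions, adding 1,200 customers costs three trips to the store rather than 1,200.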
- Logging as a first class citizen
Don’t wait until somebody asks about API usage to decide to log it. Build logging into your API wrapper, from the start, at the point of every method call. It doesn’t need to be fancy, and you can use a number of freely available components to handle these logs and log rotation (consider using log4net or log4j for simple log rotation).
Log each method call and some summary or identifiable element of the data passed to it. This’ll help you profile API usage, and identify how data changed in your previously closed system.
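One way to get a log entry at the point of every method call, without tangling logging into the API code itself, is a thin wrapper around the API interface. This is only a sketch: ICustomerApi and LoggingCustomerApi are hypothetical names, and the log sink is a plain Action<string> so the example stays self-contained. In practice you’d hand it something backed by your logging library of choice.

```csharp
using System;

public interface ICustomerApi
{
    void DisableAccount(string accountId);
}

// The real API implementation; it would delegate to internal components.
public class CustomerApi : ICustomerApi
{
    public void DisableAccount(string accountId)
    {
        // Internal call omitted in this sketch.
    }
}

// Wrapper that logs every call and an identifying piece of its data.
public class LoggingCustomerApi : ICustomerApi
{
    private readonly ICustomerApi _inner;
    private readonly Action<string> _log;

    public LoggingCustomerApi(ICustomerApi inner, Action<string> log)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _log = log ?? throw new ArgumentNullException(nameof(log));
    }

    public void DisableAccount(string accountId)
    {
        _log("DisableAccount called for account " + accountId);
        try
        {
            _inner.DisableAccount(accountId);
        }
        catch (Exception ex)
        {
            _log("DisableAccount failed for account " + accountId + ": " + ex.Message);
            throw;
        }
    }
}
```

Because both classes implement the same interface, callers can’t tell whether they’re talking to the logged version, and you can’t accidentally ship a method that skips the log.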
- Security as a first class citizen
Consider the security of your API from the start of the project. Understand who will have access to your API, which organisations and which individuals. Do you require role-based security? Do you need a way to disable API support for specific customers? Are you transferring data that needs to be encrypted over the wire?
Beware of over baking your security. WS-* offers some very robust packet level security features, but if your API doesn’t need them, or is restricted to an internal network, then don’t bog down your implementation in unneeded security. Beware of making security choices that tie you down to a specific protocol or technology stack – you want to keep your API usable for the consumers. Do the simplest thing that works.
- Have integration tests!
Make sure you have integration tests with mocking at the business logic layer. These tests are for your API wrappers, NOT your logic. The logic should be tested independently; you’re just ensuring your API layer, marshalling and method calls operate correctly at this level.
With any luck, your business logic should already be tested as part of your existing test suite (which you have, right?) but if not, ensure the business logic is tested separately from the API code.
Consider using a TDD or BDD approach to designing your API calls, designing the specification first in the form of some calling code, then writing the code required to make your usage examples compile. This will help you understand exactly what calls the client will have to make to your API to achieve specific functionality. These tests can happily double up as regression tests when you make changes to the API.
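To show the shape of such a test, here’s a sketch that swaps the business layer for a hand-rolled fake and checks only what the API wrapper is responsible for: marshalling the input and formatting the output. ICustomerService, FakeCustomerService and the “CUST-” id format are all invented for the example, and the test is written as a plain method rather than against any particular test framework.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; } = "";
}

// Hypothetical boundary to the internal business logic.
public interface ICustomerService
{
    int Register(string normalisedName);
}

// The API wrapper under test: marshal in, delegate, marshal out.
public class CustomerApi
{
    private readonly ICustomerService _service;

    public CustomerApi(ICustomerService service) { _service = service; }

    public string AddCustomer(Customer customer)
    {
        if (customer == null) throw new ArgumentNullException(nameof(customer));
        int id = _service.Register(customer.Name.Trim());
        return "CUST-" + id;
    }
}

// Fake business layer: records what it was asked and returns a canned id.
public class FakeCustomerService : ICustomerService
{
    public List<string> Received { get; } = new List<string>();

    public int Register(string normalisedName)
    {
        Received.Add(normalisedName);
        return 7;
    }
}

public static class CustomerApiTests
{
    public static void AddCustomer_TrimsNameAndFormatsId()
    {
        var fake = new FakeCustomerService();
        var api = new CustomerApi(fake);

        string id = api.AddCustomer(new Customer { Name = "  Ada  " });

        if (id != "CUST-7")
            throw new Exception("Unexpected id: " + id);
        if (fake.Received.Count != 1 || fake.Received[0] != "Ada")
            throw new Exception("Service was not called with the trimmed name");
    }
}
```

Written first, a test like this doubles as the usage example: if the calling code in the test feels awkward, the API probably is too.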
- Keep it simple! If all else fails, do what the big boys do.
Always strive to keep your API simple. Pretend you’re the consumer at all times. If you’re unsure of how to proceed, I’ve always found inspiration, both for what to do and what not to do, from reading the API documentation of some large companies that have widely used APIs. It’s safe to say that the likes of Amazon, Google and Microsoft have had to put some thought into their API designs. Beware of trusting their decisions blindly, but liberally borrow anything you, as a consumer, would find pleasing in your API.
I’m not going to try and convince you that by following my advice your API design will be flawless. I’m really hoping for a little discussion on this topic as it seems like something that is rarely covered and often “felt out” by the people left to implement APIs for the public. These are just some lessons I’ve learnt on the way while implementing several public facing APIs.
Want to talk about APIs? Send me an email!