Archive for March, 2014

Doing Open Source Right

Monday, March 31st, 2014

A brief history of free* and open source software…

The rise of free and open source software is reasonably well documented, with several significant projects built and released from the 1970s and 1980s onwards, including Emacs and, in 1989, the first version of the GPL. Momentum gathered, pushed on by the popularity of Linux, Perl, Apache and the LAMP stack in the mid to late 1990s.

Free or open source software now drives a huge portion of the web. During the late 2000s and early 2010s, the popularisation of source code sharing sites like SourceForge, GitHub and BitBucket, along with the realisation that billion dollar businesses could be built and operated on open source software, pushed more software towards being “open by default”. What was once perceived as a risk (“giving my property away for free”) started to gain traction in private business, and even traditionally open-source-hostile organisations such as Microsoft started taking pull requests and publishing their source code.

This context is important – even if you’re not the kind of person who previously would’ve ended up writing and publishing your own software, there’s an increasing chance that you now will because the organisation you work for decides open source is worth investing in.

But I’m scared, I don’t understand why we’re doing this?

As a developer, there are lots of passive benefits to “coding in the open” – it’s a great way to learn, it’s a great way to contribute back, and as an individual, it’s the only real opportunity you have to legitimately “take your work with you” from job to job. Open source software can become your professional portfolio, and as it does, getting it right is important.

As an organisation, the motivations behind adopting open source software are obvious – “hey free software!” – but the benefit in publishing your own open source is a little more obfuscated. There’s a moral aspect to it – if you’re building your business on open source software, it’s perhaps the “right” thing to do to give back to the community you’re benefiting from. Giving back isn’t going to make you money – it’ll probably cost you some – but there are good reasons why publishing or contributing back to open source projects is a rational thing for your business to do.

Open source is a great way to attract talent – hiring excellent people is hard, and developers who enjoy contributing to open source software will be drawn to businesses willing to pay them to do just that. Their enthusiasm is infectious and will make your teams better. It’s a solid publicity tool to raise the profile of your organisation in the tech community. It’s a good way to enhance confidence in your business amongst technical people – if they can see your code, and it’s good, you’ll win supporters. And in the end, if you get external contributions back to your open source projects, that’s a nice thing to have.

If that all sounds intimidating, it’s ok – the fear of continual evaluation and scrutiny is human, especially when you consider we’re an industry of professional amateurs learning much of what we do as we go to keep up with the pace of change in the tech industry. As you increase your contributions, you get more familiar with the kind of feedback cycles open source gifts you with, and hopefully it all becomes a lot less intimidating – everyone is in it together. Reading lots of code makes you a better developer, and contributing back makes you better still. You’ll learn from experts, and maybe teach somebody else along the way.

So let’s run an open source project!

Like everything else about building great software, open sourcing software requires discipline and effort. But there are some real world, practical tips to making your open source software successful. Remember that open source is a commitment – you can’t trivially “un-open” your software – once it’s done it’s done.

Don’t surprise potential users or contributors

Follow a predictable repository layout. There are some strong language-neutral conventions for open source project topology that people have grown to expect. People know to look for familiar signposts in plain text or Markdown: README, LICENSE and CONTRIBUTING files are essential, and the guidance in them should always be accurate.

The README serves as the top level overview and getting started guide for your project. Compilation instructions, quick-start examples and links to any deeper documentation are essential. The CONTRIBUTING guide should give potential contributors useful information, and your LICENSE file will likely be a standard one that people expect to be able to find.
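If you want to check for those signposts automatically, a minimal sketch in JavaScript might look like the following. The function name `missingSignposts` and the rule of ignoring case and file extension are assumptions made for this example, not an established convention or tool.

```javascript
// The three signpost files named above.
var EXPECTED_SIGNPOSTS = ["README", "LICENSE", "CONTRIBUTING"];

// Returns the signposts missing from a list of root-level file names.
// Matching ignores case and any extension, so "readme.md" counts as README.
function missingSignposts(rootFileNames) {
  var present = rootFileNames.map(function (name) {
    return name.replace(/\.[^.]*$/, "").toUpperCase();
  });
  return EXPECTED_SIGNPOSTS.filter(function (signpost) {
    return present.indexOf(signpost) === -1;
  });
}

// Example: a repository with a README and a license but no contributing guide.
console.log(missingSignposts(["README.md", "LICENSE", "src", "package.json"]));
// prints [ 'CONTRIBUTING' ]
```

A check like this could run as part of a build, but it only confirms the files exist; keeping their contents accurate is still down to you.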

Make sure building and testing is easy

Regardless of language or platform, you should stick to the established language conventions in that ecosystem.

People will give up on your project if the barrier to building and testing it is high or requires a lot of manual configuration. Practically all mainstream programming languages have mature package management solutions, so use them. If you’re doing Ruby, make sure you’ve got a working Rakefile; if you’re in .NET, I’d expect “F5” to build and run your project. Keeping that barrier to entry low is essential.

Where possible, leverage cloud continuous integration services to provide confidence in the current build of your software – it’ll help potential contributors know if they’re dealing with “works on my machine” problems.

Guiding contributors

The contributing file is your contract with potential contributors. It should give them useful information. You should make sure that you guide them towards running your test suite, explain how you’d prefer any pull requests or code submissions to be delivered, outline the coding conventions, and highlight key contributors.

Obviously this is a two way street and in order to encourage high quality contributions, you have to keep your end of the deal. It’s common to require a failing test and a fix for any contribution – this’ll make your life easy, but if you don’t publish a decent test suite or set of unit tests, you can’t realistically expect it. A lack of tests will dissuade contributions – would you change some code without knowing what the impact could be?

Be responsive and communicative

You might not want all the contributions that come your way, and it’s perfectly fine as a project owner to say no to a change that isn’t relevant to the software, so make it clear what kind of changes you’re interested in. The simplest way to do that is to guide users to create an issue in an issue tracker before they start working on a code submission. This helps stop people spending time and effort working on code that you later reject, and prevents animosity between you and potential contributors.

It’s also useful to create a roadmap of issues in your issue tracker, flagging simple changes that may be suitable for first time contributors if you’re looking to encourage submissions. This is a great way to gain confidence in submissions and a clear way to communicate the direction of development.

Finally, it’s important to respond at all. Respond to issues and pull requests in a timely manner, make sure you have a few canned Twitter searches or alerts for people struggling with your software, and if need be, use free tools like Google Groups to encourage searchable discussions that might help others later, rather than private email conversations.

Don’t be afraid of criticism

By publishing your code, you’re welcoming comments and feedback – it’s not always going to be positive, but you should do your best to steer it towards being constructive. It’s worth remembering that if your code made somebody’s job easier, or life better, it was worth publishing, even if it’s not the best code you’ve ever written.

Selecting a reasonable license

If you’re releasing source code, you must license it, even if you just want people to be able to “do whatever they want” with it. Choose A License offers excellent overviews of the most popular open source licenses, but the really short version is this:

  • If you care about users of your code contributing back and enforcing “software freedom”, choose a “copyleft” license, probably the latest version of the GPL. If you’re publishing a software library, you probably want the LGPL.
  • If you just want to put the code out in the open, and not have anyone try and use it against you in a court when they destroy their business with it, go with the MIT license.
  • If you’re worried about contributors submitting patented code and were considering the MIT license, you should probably go with the Apache license.

These are the most popular licenses, and they’ll probably cover what you’re trying to do. It’s worth noting that the GPL is a viral license, requiring software that includes GPL’d code to also be released under the GPL – it’s central to the philosophy of the FSF and the free software movement, but can be a barrier to adoption in for-profit organisations who don’t want to open source their own software.

Things to avoid

There are a few anti-patterns when it comes to sharing source code.

Using an open source repository as a “squashed, single commit mirror” defeats much of the purpose. Compressing your commits into single “Version 1.2”, “Version 1.3” commits hides the evolution of the software from people who might have a genuine interest in the changelog. This leads people to believe the software is “open source in name only”, and it’s hostile towards contributions.

Avoid pushing broken builds to the HEAD of your repository – if need be, maintain a development and a master branch, with only good, clean, releasable code going into master. This is just good practice, but when people you don’t know could well be building on your codebase, a broken build becomes worse than just ruining your colleagues’ day.

A quick recipe for success

We’ve talked about a broad range of topics here that will help you run an open source project responsibly, and why you’d want to do that – but let’s nail down a specific pattern for running your first open source project using GitHub.

  • Sign up for a GitHub personal or company account (free for open source)
  • Select a license (Apache or GPL are sane defaults)
  • Publish your code in a Git repository on GitHub
  • Publish tests with your code
  • Use GitHub issues to construct a roadmap of future features
  • Tag some future features as “trivial” and suitable for new contributors
  • Include a contributing.md file that asks for a test and fix in a pull request
  • Discourage people sending pull requests of refactors or rewrites without prior discussion
  • Include obvious scripts in the root of your repository called things like “build-and-run-tests” to give people the confidence to contribute
 
*Footnote: Free Software vs. Open Source Software

There has long been contention between the concepts of “free software” and “open source software”, and while “all free software qualifies as open source, but not all open source software is free as in freedom” – I’m going to be avoiding the distinction here. If you’re interested in the discussions around this, this summary on Wikipedia is a good place to start, along with GNU’s article on “Why open source misses the point”.

If you’re not familiar with the distinction, free in “free software” is free as in “liberated, independent, without restrictions”, while many mistake it to mean “costs no money”. This is often explained as “free as in speech, not free as in beer”, which I’ve never thought of as an especially informative one-liner.

ASP.NET MVC 101 – Extensibility Points

Thursday, March 20th, 2014

Here are the slides from a workshop I ran recently that highlight the pluggable parts of ASP.NET MVC, as a primer to “doing things the ASP.NET MVC way” – targeted at people who had mostly been exposed to “ASP.NET Classic”.

HTML Image tags and onerror javascript handlers

Wednesday, March 5th, 2014

I’ve been doing web stuff for a long, long time (since about 1997) and it never ceases to amaze me that sometimes the most trivial things can pass you by.

I was troubleshooting some weird behaviour on a client site this week – some WebDriver automation tests would sporadically hang forever, seemingly waiting for Amazon CloudFront to serve a file. Calling bullshit on the theory that “oh, CloudFront is just being funny”, we decided to dig a little deeper and see why exactly there were images that upon failing to load, would hang forever without timeouts.

Like a good little soldier, I skipped all the diagnostics and just went and looked at the code, and discovered an image tag that looked like this:

<img src="some-broken-image-link.jpg" onerror="LoadDefaultImage();" alt="caption" />

I’m not going to lie, my first response was “that cannot possibly work, I’ve never seen anything that looks like that before” – but lo and behold, after visiting a W3Schools link older than time itself, it appears that in HTML (3+? 4+?) all image tags, by default, support an onerror JavaScript handler that gets invoked if the image fails to load, for example when the server returns a non-200 status code.

When you work in technology, every day really is a school day, and things that are apparently obvious are frequently completely unknown to you. So I did a couple of cursory Google searches…

“img src”: About 12,110,000,000 results (0.21 seconds) 
“img onerror”: About 7,270,000 results (0.21 seconds)

So about 0.06% of the people out there who have heard of img tags have heard of the onerror handler, which probably qualifies it for a blog post.

Why is this useful?

You like making websites? You have a load of user generated content? You know what sucks? Broken images. They break your design, they make everything look ugly, and they take time to be requested and to time out. Like your websites to be fast? You’re going to want to get rid of these dead images, and the first step in getting rid of them is knowing about them.

Firstly, you can use the onerror attribute of the img tag to change the img.src and swap that nasty red cross for a nicer default image that doesn’t break your layout.

Secondly, you could go a step further and fire an analytics event to let you know you’ve got dead images rendering in your pages so you can fix them.

Thirdly, with a bit of javascript magic, you can use HTML5 data attributes to “safely load” images that may or may not exist, making sure you switch out bad images for nice defaults without anyone ever noticing.

I put together a trivial example for my client of a page that does just that, has a bunch of images that may or may not exist, and swaps them out silently for a nice “not found” image when the DOM is ready.

<html>
<head>
	<title>Image errors</title>
	<style>
		.some-style {
			border: 10px solid black;
			width: 100px;
			height: 100px;
		}
		/* Hide images until their real source has been assigned */
		.safelyLoadImage {
			display: none;
		}
	</style>
</head>
<body>

<img src="" data-imgsrc="something.jpg" class="some-style safelyLoadImage" />

<script src="http://code.jquery.com/jquery-1.11.0.min.js"></script>
<script src="http://code.jquery.com/jquery-migrate-1.2.1.min.js"></script>
<script>
	$(function(){

		var notFoundImage = "http://upload.wikimedia.org/wikipedia/en/thumb/d/da/Ziltoidtheomniscientcover.jpg/220px-Ziltoidtheomniscientcover.jpg";

		// Handle each image separately so every tag uses its own data-imgsrc value
		$(".safelyLoadImage").each(function(){
			var image = $(this);

			// Clear the handler before swapping in the default, so a broken default can't loop
			image.attr("onerror", "this.onerror=null; this.src='" + notFoundImage + "';");
			image.attr("src", image.data("imgsrc"));
			image.removeClass("safelyLoadImage");
		});
	});
</script>

</body>
</html>

 

What was the weird timeout thing in the end?

Turns out, if you don’t remove the img tag’s onerror handler, and then change the img.src to another image that fails to load, most browsers get into a nasty loop – that’s why in the code sample above we’re setting this.onerror=null; in the onerror handler. Suffice to say, WebDriver wasn’t a huge fan of infinitely loading broken images.
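The same guard can be written without jQuery, and it’s a natural place to hang the “tell me about dead images” reporting mentioned earlier. What follows is a minimal sketch, not a drop-in library: `attachImageFallback` and the `reportBrokenImage` callback are names made up for this example, and the callback stands in for whatever analytics call you actually use.

```javascript
// Attach a one-shot fallback to an image-like object.
// `image` is anything with `src` and `onerror` properties, e.g. an <img> element.
// `reportBrokenImage` is a hypothetical, optional callback (an analytics hook, say).
function attachImageFallback(image, fallbackSrc, reportBrokenImage) {
  image.onerror = function () {
    var brokenSrc = image.src;

    // Remove the handler first: if the fallback also fails to load,
    // the error must not trigger another swap and loop forever.
    image.onerror = null;

    if (typeof reportBrokenImage === "function") {
      reportBrokenImage(brokenSrc);
    }
    image.src = fallbackSrc;
  };
  return image;
}

// In a browser you might wire it up like this ("/not-found.png" is a placeholder):
//   attachImageFallback(document.querySelector("img"), "/not-found.png",
//     function (src) { console.log("broken image:", src); });
```

Because the handler clears itself before touching src, a missing fallback image costs you one failed request rather than an endless stream of them.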

Broken images be damned.

Introducing: ReallySimpleFeatureToggle

Monday, March 3rd, 2014

I’ve just open sourced a new NuGet package that’ll help you prefer feature toggles over feature branches.

It’s derived from some battle-tested code used across several systems over the last couple of years and should help you trivially introduce feature toggles into your codebase.

The super happy path looks a little like this:

PM> Install-Package ReallySimpleFeatureToggle

Consider this configuration:

   
<?xml version="1.0" encoding="utf-8" ?>
  <configuration>
    <configSections>
      <section name="features" type="ReallySimpleFeatureToggle.Configuration.AppConfigProvider.FeatureConfigurationSection, ReallySimpleFeatureToggle" />
    </configSections>

    <features>
      <add name="EnabledFeature" state="Enabled" />
      <add name="DisabledFeature" state="Disabled" />
      <add name="EnabledFor50Percent" state="EnabledForPercentage" randompercentageenabled="50" />
    </features>
  </configuration>

With this usage example:

    var config = ReallySimpleFeature.Toggles.GetFeatureConfiguration();

    if (config.IsAvailable(FeaturesEnum.EnabledFeature))
    {
        Console.WriteLine("This feature is clearly enabled");
    }

    if (config.IsAvailable(FeaturesEnum.DisabledFeature))
    {
        Console.WriteLine("You'll never see this.");
    }

    const int maxTries = 50000;
    var wasTrue = 0;
    for (var i = 0; i != maxTries; i++)
    {
        var recalculatedConfiguration = ReallySimpleFeature.Toggles.GetFeatureConfiguration();
        if (recalculatedConfiguration.IsAvailable(FeaturesEnum.EnabledFor50Percent))
        {
            wasTrue++;
        }
    }

    Console.WriteLine("Enabled for 50% was enabled: " + wasTrue + " times out of " + maxTries + " - Approx Percent: " + (100 * wasTrue / maxTries));

The barrier to entry is really low, and there’s a bunch of extensibility points so you can store your feature configuration in a central location, or add overrides into the configuration pipeline. Hopefully this’ll help you ship your code more often, with a little less fear.

Sold! Give it to me!

Get the source on GitHub: https://github.com/davidwhitney/ReallySimpleFeatureToggle
The package from NuGet: https://www.nuget.org/packages/ReallySimpleFeatureToggle 
Via the package management console: Install-Package ReallySimpleFeatureToggle

Read the documentation here: https://github.com/davidwhitney/ReallySimpleFeatureToggle/blob/master/README.md