Stop (ab)using CPP in Haskell sources

February 1, 2015
Yuras Shumovich

What is wrong with CPP

CPP is a C preprocessor, but it is common to use it in Haskell. That leads to a number of issues.

CPP doesn’t understand Haskell code; it assumes C. It is free to remove whitespace that is insignificant for C but not for Haskell, expand macros inside Haskell comments and strings, or mess with identifiers that contain ' or #.

Every time you change your .cabal file, e.g. add a new module or update dependencies, cabal regenerates the cabal-macros.h file. The recompilation checker then pessimistically decides to recompile all modules with CPP enabled.

If you use hlint or HaRe, then you probably know what I mean: tools that parse Haskell sources have a hard time with files that must be preprocessed first.

It is not rare to see code interleaved with ifdefs that specify different behaviour for different platforms or library versions. Sometimes that is unavoidable, though.

Most of the time CPP can be avoided or minimized. The most important tool here is abstraction.

Abstract over specific details

This is not Haskell specific; abstraction is widely used to minimize CPP in C as well. When you need different behaviour based on the current platform or the version of some dependency, try to abstract over the difference instead of inlining platform specific code.

At first glance it may look impossible to do. In such cases I usually simply duplicate the code and then refactor it to reduce duplication.

Sometimes it is convenient to start with an umbrella module that provides a unified interface for the rest of the program, and a number of platform specific implementations. Note: you don’t need CPP to select a specific module; cabal lets you conditionally include modules based on the platform or other conditions.
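To make the idea concrete, here is a minimal single-file sketch of abstracting over a platform difference. Everything in it (PathStyle, posixStyle, windowsStyle, nativeStyle, joinPath) is hypothetical and made up for illustration. To keep the file self-contained it picks the implementation at run time with System.Info.os, which is a stand-in for the compile-time module selection that cabal would do in a real package.

```haskell
import System.Info (os)

-- Hypothetical interface: everything the rest of the program needs to know
-- about platform specific path handling.
data PathStyle = PathStyle
  { separator  :: Char
  , isAbsolute :: String -> Bool
  }

-- POSIX-like implementation.
posixStyle :: PathStyle
posixStyle = PathStyle
  { separator  = '/'
  , isAbsolute = \p -> take 1 p == "/"
  }

-- Windows-like implementation (deliberately simplified: drive letters only).
windowsStyle :: PathStyle
windowsStyle = PathStyle
  { separator  = '\\'
  , isAbsolute = \p -> case p of
      (_ : ':' : _) -> True
      _             -> False
  }

-- The "umbrella": the only place that knows which platform we are on.
-- Assumption: System.Info.os reports "mingw32" on Windows.
nativeStyle :: PathStyle
nativeStyle
  | os == "mingw32" = windowsStyle
  | otherwise       = posixStyle

-- Common code is written once against the interface, with no ifdefs.
joinPath :: PathStyle -> String -> String -> String
joinPath style dir file
  | isAbsolute style file = file
  | otherwise             = dir ++ [separator style] ++ file
```

The rest of the program would call joinPath nativeStyle and never mention a platform. Passing posixStyle or windowsStyle explicitly also makes both implementations testable on any machine, which is hard to do with ifdefs.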

Example: fsnotify

An excellent example of such an approach is the fsnotify package. It defines specific implementations for linux, osx and win32, and one umbrella module. A number of other modules contain common code, so duplication is really minimal.

Note that CPP is enabled only in the umbrella module, for two reasons. First of all, it is used to import the specific implementation:

#ifdef OS_Linux
import System.FSNotify.Linux
#else
# ifdef OS_Win32
import System.FSNotify.Win32
# else
# ifdef OS_Mac
import System.FSNotify.OSX
# else
type NativeManager = PollManager
# endif
# endif
#endif

That can be avoided too. To do that we can give the same name to the platform specific modules but move them into separate directories: linux, osx and win32. Then manipulate the hs-source-dirs field in the cabal file to select the correct implementation. (Add the other implementations to extra-source-files so that cabal sdist copies them into the tarball.)

-- in System.FSNotify:
import System.FSNotify.Platform

-- in fsnotify.cabal:
extra-source-files: linux/System/FSNotify/Platform.hs
                    osx/System/FSNotify/Platform.hs
                    win32/System/FSNotify/Platform.hs
hs-source-dirs: src
if os(linux)
  hs-source-dirs: linux
if os(darwin)
  hs-source-dirs: osx
if os(windows)
  hs-source-dirs: win32

The other use of CPP is to define forkFinally, which is missing from older versions of base:

#if !MIN_VERSION_base(4,6,0)
forkFinally :: IO a -> (Either SomeException a -> IO ()) -> IO ThreadId
forkFinally action and_then =
  mask $ \restore ->
    forkIO $ try (restore action) >>= and_then
#endif

The same technique can be used to avoid CPP here. I personally prefer to hide such snippets in a custom prelude and not bother with hs-source-dirs.
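As a rough sketch of the CPP-free variant, the fallback definition could live in its own file, say old-base/Compat/Concurrent.hs, while a sibling file of the same module name in another directory would only re-export forkFinally from Control.Concurrent. The module name, the directory layout and the runAndWait helper are all hypothetical and not taken from fsnotify.

```haskell
-- Hypothetical file: old-base/Compat/Concurrent.hs
-- In a package this file would start with
--   module Compat.Concurrent (forkFinally) where
-- and the sibling for newer base would just re-export
-- Control.Concurrent.forkFinally under the same module name.
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, mask, try)

-- Same definition as in the snippet above, but without any ifdef around it.
forkFinally :: IO a -> (Either SomeException a -> IO ()) -> IO ThreadId
forkFinally action andThen =
  mask $ \restore ->
    forkIO $ try (restore action) >>= andThen

-- Usage example (hypothetical helper): run an action in a new thread and
-- block until its result, or the exception it threw, is available.
runAndWait :: IO a -> IO (Either SomeException a)
runAndWait action = do
  done <- newEmptyMVar
  _ <- forkFinally action (putMVar done)
  takeMVar done
```

Client code imports the compat module unconditionally, and the cabal file decides which directory ends up in hs-source-dirs.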

(I don’t think CPP should be avoided at all costs; I think the amount of CPP used in fsnotify is a good compromise. I just used it as a real-world example of how to avoid CPP.)
