General Issues with Logging in Bioclipse2

From Bioclipse
Jump to: navigation, search
Development tutorial
Responsible author:Rklancer
Bioclipse version:N/A
"N/A" is not a number.
Last updated:2009-10-10
Tags:


Basic idea of logging packages

The basic idea of logging packages in common use (mainly java.util.logging and Log4j--more about commons-logging below) is configurability. Instead of naively writing to System.out or System.err (thereby forcing your code to check whether it's in "debug" mode before every write, lest it spew debugging output to unsuspecting end users), you write to a logger object managed by a logging system that decides whether, and how, the logger object should actually output what your code wrote to the logger.

Logging packages create logger objects that live in a hierarchical namespace in which configuration is inherited "down" the hierarchy, so that it's easy to configure logging for all your code at once while also allowing parts of your code to use sub-namespaces that are configured differently. The namespace is separated in an obvious way by dots (i.e., the logger associated with the name "my.package.classname" is a child of the logger associated with "my.package", etc.) So, when using Log4j you would do

package my.package;

import org.apache.log4j.Logger;
import org.apache.log4j.Level;

Class classname {
	
// ....
	
	Logger pkgLog = Logger.getLogger("my.package")
	pgkLog.debug("Suppose the threshold for my.package is set to Level.DEBUG, Level.TRACE, or Level.ALL -- then this will be printed");

	Logger classLog = Logger.getLogger("my.package.classname");
	classLog.debug("On the other hand, suppose the threshold for the sub-namespace my.package.classname is set to Level.WARN -- then this won't be printed");
	
// ...
	
}

The other main axis of configurability is the mechanism to be used for logging output. In Log4j, this is specified by the Appenders attached to each Logger, and the Layouts attached to each Appender. The idea is that Appenders accept a message to be logged, format it by passing it to their Layout, and then append the formatted message to the Appender's particular output mechanism (a database, the console, a log file, mail messages, whatever)

As I mentioned, configuration is inherited down the hierarchy. So if you set the appenders and logging threshold at the root level, these settings will be inherited by all of the other loggers (modulo settings that allow inheritance to be "stopped" at particular sub-namespaces; see the Log4j documentation about "appender additivity", for example.) This makes it easy to turn debug logging on or off for your whole application, by changing one line of a configuration file that specifies the debugging level for the root of your logging namespace.

Note that the logging namespace is laid out however you wish--although I (sort of) followed a common convention above, namely for classes to write to loggers whose name happens to be the same as that of the class doing the logging, it is not required that you follow this convention, and it is occasionally preferable to use some other convention. For example, later I'll describe why it might make sense to have "RECORDING.*" and "SCRIPTING.*" namespaces in addition to the base namespace. This would allow us to configure whether or not to treat the logs written by the recording mechanism, or those created by user scripts, separately from the main Bioclipse logs.


Using SLF4J to substitute for the problematic commons-logging packages

There is code in the Bioclipse target platform (mainly in Spring-Core and Spring-DM) that logs using the org.apache.commons.logging (Jakarta Commons Logging, aka JCL or commons-logging) mechanism. Commons-logging works much the same way with respect to namespaces as the logging libraries mentioned above; in fact, it has to, because all it does is delegate actual logging to an underlying library such as Log4j or java.util.logging. This is done because commons-logging is intended to be used by components (not unlike OSGI bundles/Eclipse plugins) that may be used in various environments unforseen by the component author, who cannot tell ahead of time what logging facilities will be available nor how they are configured.

However, commons-logging doesn't actually work in an OSGI environment like Eclipse, because commons-logging attempts to use classloader tricks in order to determine just which logging package it should delegate to. I don't know exactly what tricks it does use, but the common wisdom appears to be that these tricks do not work in OSGI - remember that each OSGI bundle has a different classloader and associated classpath; in general (not counting cases where you use Dynamic-ImportPackage or Eclipse-BuddyPolicy/Eclipse-RegisterBuddy manifest headers) classes from other bundles are only visible from your bundle if another bundle exports them and you either specify to import those packages via Import-Package or require the exporting bundle via Require-Bundle. Moreover, in OSGI environments, the classloader associated with Class X is the classloader of the bundle that exports class X; and in Eclipse specifically, the ContextClassLoader associated with each thread is dynamically set to be the Classloader of the bundle that called getContextClassloader(). (see http://wiki.eclipse.org/index.php/Context_Class_Loader_Enhancements)

The solution promoted by the authors of Spring DM (formerly known as Spring-OSGI) is to use the SLF4J logging library instead. SLF4J ("Simple Logging Facade for Java"), written by the author of Log4j, includes replacement org.apache.commons classes, but uses a different underlying mechanism that does not depend on the OSGI-incompatible classloader tricks. To use this replacement we need four plugins:

Plugins required to replace com.apache.commons.logging

  1. jcl104_over_slf4j:
    • Exports the *packages* org.apache.commons.logging and org.apache.commons.logging.impl
    • This is the SLF4J-provided plugin to use instead of the usual commons-logging plugin, named "org.apache.commons.logging", which exports the same packages. (See the note below about this issue.)
    • This plugin doesn't actually do logging itself, as it delegates the logging calls to the SLF4J API provided by the next package, slf4j.api
  2. slf4j.api:
    • Exports the packages org.slf4j, org.slf4j.helpers, and org.slf4j.spi
    • This plugin provides SLF4J's own API for logging (you call org.slf4j.LoggerFactory to get an org.slf4j.Logger object which you can log to). However, this is a pluggable API that can be implemented in various ways--it must import the package org.slf4j.impl in order to actually log.
  3. slf4j.log4j12:
    • This is one of several plugins that export alternative versions of the SLF4J implementation package org.slf4j.impl The idea is that the different plugins that export this package route logger calls to different underlying implementations -- this package routes logger calls to log4j version 1.2 in this case, but alternate versions link to log4j version 1.3 (which is slightly incompatible with 1.2 and is not officially a stable release), to java.util.logging, to the "Simple Logging" library, or even to Jakarta Commons Logging proper.
    • The particular version of org.slf4j.impl that is exported by this plugin binds the SLF4J LoggerFactory.getLogger() method to the log4j Logger.getLogger() method, and binds the SLF4J Logger logging methods to the Log4J logging methods. It does this in a static way, i.e., the different implementations of org.slf4j.impl available from different plugins are hard-coded to return different logger objects and logger factories according to the intended wiring.
  4. some plugin that provides log4j itself:
    • Should export org.apache.log4j and org.apache.log4j.spi (note there are a lot of other log4j packages that our code should not need -- and it's probably best not to export them, to prevent accidentally writing code that won't be compatible with Logback. I'll explain this below.)
    • Keep in mind that we need this log4j implementation because slf4j.log4j12 binds the SLF4J API to log4j, but doesn't provide an implementation of log4j

Changes needed to make SLF4J work in the Bioclipse target platform

  1. First, it's necessary to remove the "org.apache.commons.logging" *plugin* from the target platform. Currently, the plugin org.apache.commons.logging is included in Eclipse (at least the version I have on my desktop) and because, as I write this, the bioclipse target platform includes all plugins found in the main eclipse directory, Bioclipse presently includes two different packages that export org.apache.commons.logging. This confuses the Eclipse Equinox OSGI runtime, which can't know to wire your plugins to jcl104_over_slf4j instead of org.apache.commons.logging. (Actually, this is not quite the case; a bug in SLF4J 1.4.3 is that jcl104_over_slf4j actually fails to list a version in its Export-Package header, so Equinox assumes it exports version 0.0.0. Many of the plugins specify that at least version 1.0.4 must be imported, so Eclipse Equinox actually wires them org.apache.commons.logging, which is appropriately versioned. Version 1.5.0 of SLF4J fixes this particular problem, but of course doesn't eliminate the need to remove the org.apache.commons.logging plugin)
  2. Second, it's necessary to remove the slf4j.simple plugin from the target platform. This is a plugin that, like slf4j.log4j12, exports org.slf4j.impl. Just as with the commons-logging confusion noted above, this confuses Equinox, which doesn't know which plugin's org.slf4j.impl to wire the SLF4J API to. (A confusing point here is that slf4j.log4j12 would be disabled if it couldn't find a plugin exporting the Log4J API, but slf4j.simple doesn't end up so disabled, because it doesn't just wire the SLF4J API to an underlying logger--instead, it provides a trivial logging implementation of its own, which just writes to the console and doesn't check thresholds, etc.)
  3. Third, as implied in the above issue, we need to use version 1.5.0 of the SLF4J plugins, not the version 1.4.3 currently present in the target platform. 1.5.0 not only fixes the commons-logging versioning mentioned above, but also fixes the "PDE Export" bug you may have run across. There is an issue (possibly a bug) with Eclipse PDE's Export feature that causes it to refuse to build a product when two plugins in the product mutually depend on each other in the way that slf4j.api depends on slf4j.log4j12 while slf4j.log4j12 also depends on slf4j.api. The fix is to supply slf4j.log4j12 as a plugin *fragment* attached to slf4j.api, rather than as a separate plugin from slf4j.api. Version 1.5.0 of SLF4J fixes the build cycle problem by supplying slf4j.log4j12 as a fragment.

Possible eventual replacement of log4j with Logback (again via SLF4J)

It might make sense to switch to using the new Logback library instead of log4j for our underlying logging implementation. In particular, there are four reasons to consider using Logback eventually:

Reasons to consider Logback

  1. The main author of Logback is the author of log4j and SLF4J, and he intends logback to be the successor to log4j (Logback is nearly at 1.0 and provides many useful Appenders and Layouts)
  2. The Logback authors provide an Eclipse plugin that monitors your Logback log and provides two nice features: Clicking log lines open an Eclipse java editor to the line that called the logger; and clicking on logged Exceptions and Errors open the stack trace in the Eclipse Stack Trace view.
  3. Logback includes support for parameterized logging, which results in much faster logging calls - e.g., you can write something like:
    logger.debug("Called from {}: arguments were {} and {}", new Object[] {callingMethodName, arg1, arg2});
    which, if debug logging is turned off, avoids the five string concatenations and two toString() calls in the form you would otherwise have to write:
    logger.debug("Called from " + callingMethodName + ": arguments were" + arg1.toString() + " and " + arg2.toString());
    It should be noted that in Log4j 1.2, you can get the same speedup by rather tediously writing each logging statement in this form:
    if (logger.isDebugEnabled()) { logger.debug("Called from " + callingMethodName + ": arguments were" + arg1.toString() + " and " + arg2.toString()); }
  4. Logback allows file inclusion and parameter substitution in its configuration files. This may be a very useful feature for the replacement I propose for the net.bioclipse.logging plugin.

Reasons switching to Logback can be painless

  1. Logback implements the SLF4J API, so jcl104_over_slf4j would still work. (Logback implements the org.slf4j.Logger interface natively, rather than supplying an adaptor that translates calls to org.slf4j.Logger methods to some underlying implementation.)
  2. Once Logback is supplying the SLF4J implementation, you can use the SLF4J-supplied plugin that routes log4j api calls to Logback. So it would be easy to keep writing to the Log4j API while using Logback underneath, until a final decision is made whether to keep using Logback permanently (at which time you could switch to making calls directly to the Logback API--i.e., the SLF4J API--if you needed access to some of its more advanced features like parameterized logging and markers). The translation of log4j calls to Logback calls would work the same way as the translation of commons-logging calls to Logback calls, and the two adaptor plugins would work in parallel, providing the commons-logging api, the log4j api, and the native SLF4J api all at the same time:
org.apache.commons.logging packages (provided by jcl104_over_slf4j) ---\
                                                                       |
                                                                       +---> org.slf4j packages (provided jointly by slf4j.api  *AND*  qos.ch.logback) 	
                                                                       |                                                |                      ^ 
org.apache.log4j packages (provided by log4j_over_slf4j) --------------/                                                |                      |
                                                                                                                        +--> org.slf4j.impl ---+

                                                                                                              (the Logback-supplied version of org.slf4j.impl wires
                                                                                                               SLF4J's LoggerFactory to Logback's implementation,
                                                                                                               but omits an adaptor from org.slf4j.Logger to Logback,
                                                                                                               because Logback provides org.slf4j.Logger itself)

OSGI and Eclipse considerations

As I mentioned above, the mechanism of OSGI classloading create problems for the standard implementation of commons-logging, necessitating the use of SLF4J's version of commons-logging instead.

OSGI classloading creates another problem for logging systems that is also worth keeping in mind. Specifically, log4j (and logback) can be configured at runtime to use different Appenders and Layouts to handle output. Since this can be configured at runtime, the plugin providing logger code cannot be configured in advance to import the packages in which appenders and layouts are defined. In standard Java applications, this is not a problem, because log4j uses the equivalent of Class.forName("your.fully.qualified.appender.classname") to instantiate your appender and attach it to the logger hierarchy, meaning that it's merely enough that your appender lives on the global application classpath. But there is no equivalent of the global classpath in an OSGI environment, and if the logging plugin doesn't import the package or require the bundle that define the appender, its classloader cannot see them.

This is a well-known issue with respect to using logger plugins in Eclipse, and the standard workaround is to have the logger plugin contain a manifest header that declares "Eclipse-BuddyPolicy: registered". This casues the logging plugin's classloader to implement an Eclipse-specific "buddy classloading" policy in which, whenever the classloader fails to find a given class via the usual OSGI mechanism (see below), it looks for that class in all the bundles which have an "Eclipse-RegisterBuddy: <logging.plugin.name>" header in their manifest. (The log4j plugin net.bioclipse.target.platform/required-libs/log4j.osgi-1.2.13-SNAPSHOT.jar does have such an Eclipse-BuddyPolicy header.)

By the way, the standard OSGI classloading algorithm for a given bundle is:

  1. first: look for the class in all the bundles to whom you are wired via Import-Package;
  2. second: look for the class in all the bundles to whom you are wired via Require-Bundle;
  3. third, look for the class in your own bundle's classpath (i.e., within the bundle jar);
  4. fourth, if you have defined a Dynamic-Import header for that class, look for the class in all bundles.


Incompatibilities between typical logging library assumptions and Eclipse RCP applications

However, apart from mechanical, classloading issues in the OSGI environment, it's worth noting that there are some problems with using typical logging libraries such as log4j that arise because the assumptions under which logging libraries are designed clash with the conditions that hold in Eclipse RCP applications. Specifically, each different instance of a given Eclipse RCP application can consist of an unpredictable collection of plugins and features from widely varying sources. This leads to two notable problems:

  1. The configuration file for logging needs to be dynamically updated, on a per user basis, when various plugins are installed in their application. That said, in most cases, you ought to be able to get away with simply NOT configuring the different plugins, and just letting them inherit logging properites from the root logger. That way you can select logging outputs, formats, and a threshold for the whole application at once. But if plugins need to configure their own logging, they need to respect two important considerations:
    • That they live in a larger application, and should not reset the properties of the whole hierarchy of loggers, but only the ones they define
    • That they shouldn't override the logging threshold defined by user or developer of the whole application--or, to generalize that principle, they should only add to, but not override, configuration properties set higher in the logger hierarchy (how irritating would it have to be for a developer to have to reach into some recalcitrant plugin to reset its logging threshold each time she wants to turn debugging on or off?)
  2. The hierarchy of loggers created by many different plugins may not make any sense from the perspective of a developer wishing to have logically-related code be debugged together. Consider that certain plugins might depend critically on code belonging to packages that are not in the same package hierarchy; for example, net.bioclipse.cdk.* includes code in the org.openscience.cdk.* namespace. But if you're debugging net.bioclipse.cdk, you might like to have the properties of the org.openscience.cdk hierarchy inherit from the ones set for net.bioclipse.cdk.

Problems with com.tools.logging

I won't propose a solution here, except to point out that the solution provided by com.tools.logging (from which net.bioclipse.logging was adapted) probably isn't what we want. I do not think com.tools.logging was intended for RCP-type applications where many plugins work together to define a product. Instead, it seems to have been designed more for situations where wholly independent plugins are being developed; the com.tools.logging solution specifically **prevents** each plugin from inheriting the settings defined for the whole application!

The com.tools.logging architecture defines a separate, independent hierarchy of loggers for each plugin that uses its services. Apart from preventing any inheritance of logger configuration, this has the side effect that, unless you go out of your way to configure the "default" hierarchy of loggers (the one used by almost all legacy code, ie., the one implicitly used in calls to org.apache.log4j.Logger.getLogger()), the default hierarchy remains unconfigured, and logging output by all the legacy code in your application is lost.

I have some suggestions regarding a Better Logging Architecture for Bioclipse 2, but I'll write more about that later.

How to handle an error

If you are in a situation (meaning: your code is a situation) where you get an exception which you do not know how to handle, you can use the method handleException(Exception ex, Logger logger) in net.bioclipse.core.util.LogUtils. It takes your exception and a logger configured for that class and does the following:

  • Print stack trace to logger
  • Print stack trace to console
  • Display a nice message