Lumberjack API

Lumberjack is a tool for processing large amounts of columnar data stored in ROOT TTree objects. For analysis purposes, these data are typically processed to obtain simpler analysis-level objects such as histograms or profile histograms (“profiles” for short).

Consider the following typical workflow:

  1. apply arbitrary filters to the data

  2. split a dataset into several (possibly overlapping) regions

  3. bin the data (possibly in more than one dimension) to produce histograms and/or profiles
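The three steps above can be sketched in plain Python, without ROOT or Lumberjack, to show the kind of bookkeeping involved. The event records, the cut value, the region names and the bin edges below are all invented for illustration, and this is not how Lumberjack itself is implemented.

```python
from bisect import bisect_right

# Toy events; in a real analysis these would be entries of a TTree.
events = [
    {"pt": 12.0, "eta": 0.3},
    {"pt": 35.0, "eta": 1.7},
    {"pt": 48.0, "eta": -0.9},
    {"pt": 75.0, "eta": 2.8},
    {"pt": 5.0, "eta": 0.1},
]

# Step 1: apply a filter (hypothetical threshold).
selected = [ev for ev in events if ev["pt"] > 10.0]

# Step 2: define regions; an event may belong to more than one.
regions = {
    "inclusive": lambda ev: True,
    "central": lambda ev: abs(ev["eta"]) < 1.3,
}

# Step 3: bin one quantity per region to obtain histograms.
pt_edges = [10.0, 30.0, 50.0, 100.0]


def bin_index(value, edges):
    """Return the index of the bin [edges[i], edges[i+1]) holding value, or None."""
    i = bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


histograms = {name: [0] * (len(pt_edges) - 1) for name in regions}
for ev in selected:
    idx = bin_index(ev["pt"], pt_edges)
    if idx is None:
        continue  # value falls outside the binning range
    for name, in_region in regions.items():
        if in_region(ev):
            histograms[name][idx] += 1
```

Even in this small example the cut value, the region definitions and the bin edges are interleaved with the loop itself, which is the mixing of generic and analysis-specific code described in the next paragraph.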

In principle, this is achievable by using the interface provided by the TTree directly, but this implies writing a lot of “boilerplate” C++ and/or Python code, which can be tedious and error-prone. In addition, this approach often results in generic code (e.g. the event loop) being mixed with analysis-specific code and metadata (binnings, threshold values for filters, etc.), which can prove difficult to debug and maintain.

Lumberjack aims to provide users with a simple but powerful interface for configuring and running such a workflow. It uses the RDataFrame interface in ROOT for fast multi-threaded processing of TTrees and outputs the resulting analysis-level objects to a ROOT file in an intuitive structure.

Configuration tools

class Lumberjack.Quantity(name, expression, binning, named_binnings=None)

Bases: object

property binning

The default binning for this quantity. Always defined, even if no special “named binnings” are.

property named_binnings

Named binnings defined for this quantity.

clone(**kwargs)

Create a clone of this quantity, replacing properties as specified by the keyword arguments.
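The clone-with-overrides pattern can be illustrated with a small stand-in class. ToyQuantity below is hypothetical: it only mirrors the constructor arguments listed above and is not the Lumberjack implementation, and the argument values are made up.

```python
class ToyQuantity:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(self, name, expression, binning, named_binnings=None):
        self.name = name
        self.expression = expression
        self.binning = binning
        self.named_binnings = named_binnings

    def clone(self, **kwargs):
        # Start from the current properties, then apply the overrides.
        params = {
            "name": self.name,
            "expression": self.expression,
            "binning": self.binning,
            "named_binnings": self.named_binnings,
        }
        params.update(kwargs)
        return ToyQuantity(**params)


pt = ToyQuantity("pt", "jet_pt", [10, 30, 50, 100])
pt_fine = pt.clone(name="pt_fine", binning=[10, 20, 30, 40, 50, 100])
```

In this sketch the original object is left untouched, and any property not named in the keyword arguments is carried over to the clone.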

property named_binning_keys

Splitting keys for which named binnings have been defined for this quantity.

iter_bins(indices=slice(None, None, None))

Generator function. Yields a tuple (lo, hi) of bin edges for each bin. A subset of the bins may be obtained by providing a list of indices or a slice object.
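A minimal sketch of this behaviour, written as a standalone function rather than a method. It assumes the binning is a plain sequence of ascending bin edges, which is an assumption about the internal representation and not something stated here.

```python
def iter_bins(edges, indices=slice(None, None, None)):
    """Yield (lo, hi) for each selected bin of a sequence of bin edges."""
    # Pair each edge with its successor to form the bins.
    bins = list(zip(edges[:-1], edges[1:]))
    if isinstance(indices, slice):
        selected = bins[indices]
    else:
        selected = [bins[i] for i in indices]
    for lo, hi in selected:
        yield (lo, hi)


edges = [0, 10, 20, 50, 100]
all_bins = list(iter_bins(edges))               # default: every bin
first_two = list(iter_bins(edges, slice(0, 2))) # subset via a slice
picked = list(iter_bins(edges, [0, 3]))         # subset via a list of indices
```

With N edges there are N - 1 bins, so the indices refer to bins, not to edges.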

get_named_binning(key, value)

Retrieve the binning (if defined) for the case where the splitting key key has the value value. If none is defined, return None.
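The lookup can be pictured as follows. This sketch assumes that named binnings are stored as a nested mapping of the form {splitting key: {value: binning}}; that layout, the splitting key "region" and its values are hypothetical and may differ from what Lumberjack actually uses.

```python
def get_named_binning(named_binnings, key, value):
    """Return the binning registered for key == value, or None if there is none."""
    by_value = named_binnings.get(key)
    if by_value is None:
        return None
    return by_value.get(value)


default_binning = [10, 30, 50, 100]
named_binnings = {
    "region": {
        "forward": [10, 50, 100],  # coarser bins where statistics are low
    },
}

forward = get_named_binning(named_binnings, "region", "forward")
central = get_named_binning(named_binnings, "region", "central")

# A caller could fall back to the default binning when no named one exists.
central_binning = central if central is not None else default_binning
```

Returning None rather than raising lets the caller decide on the fallback, which fits the statement above that the default binning is always defined.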