Lumberjack is a tool for processing large amounts of columnar data stored in ROOT
objects. For analysis purposes, these data are typically processed to obtain simpler
analysis-level objects such as histograms or profile histograms (“profiles” for short).
Consider the following typical workflow:
- apply arbitrary filters to the data
- split a dataset into several (possibly overlapping) regions
- bin the data (possibly in more than one dimension) to produce histograms and/or profiles
In principle, this is achievable by using the interface provided by the ROOT framework,
but this implies writing a lot of “boilerplate” C++ and/or Python code, which can be tedious
and error-prone. In addition, this approach often results in generic code (e.g. the event loop)
being mixed with analysis-specific code and metadata (binnings, threshold values for filters, etc.),
which can prove difficult to debug and maintain.
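To illustrate the problem, here is a minimal hand-written sketch of the filter, split and bin workflow in plain Python. It does not use ROOT or Lumberjack. The toy events, the column names `pt` and `eta`, the threshold, the bin edges and the region definitions are all assumptions made up for this example. Note how the analysis-specific metadata sits right next to the generic event loop.

```python
from bisect import bisect_right

# Hypothetical toy events; in a real analysis these would be TTree columns.
events = [
    {"pt": 12.0, "eta": 0.3},
    {"pt": 35.0, "eta": -1.7},
    {"pt": 48.0, "eta": 0.9},
    {"pt": 75.0, "eta": 2.1},
    {"pt": 5.0, "eta": 0.1},
]

# Analysis-specific metadata (all values are illustrative assumptions).
PT_MIN = 10.0                         # filter threshold
PT_EDGES = [10.0, 30.0, 50.0, 100.0]  # bin edges for the "pt" histogram
REGIONS = {                           # regions may overlap ("all" contains the others)
    "all": lambda ev: True,
    "central": lambda ev: abs(ev["eta"]) < 1.3,
    "forward": lambda ev: abs(ev["eta"]) >= 1.3,
}


def fill_histograms(events):
    """Generic event loop: filter, split into regions, bin in pt."""
    n_bins = len(PT_EDGES) - 1
    histograms = {name: [0] * n_bins for name in REGIONS}
    for ev in events:
        # 1. filter
        if ev["pt"] < PT_MIN:
            continue
        # 2. find the bin index; skip values outside the binning range
        i_bin = bisect_right(PT_EDGES, ev["pt"]) - 1
        if not 0 <= i_bin < n_bins:
            continue
        # 3. fill one histogram per region the event belongs to
        for name, in_region in REGIONS.items():
            if in_region(ev):
                histograms[name][i_bin] += 1
    return histograms


histograms = fill_histograms(events)
for name, counts in histograms.items():
    print(name, counts)
```

Even in this small example, changing a bin edge or adding a region means editing the same file that contains the loop logic.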
Lumberjack aims to provide users with a simple but powerful interface for configuring and running
such a workflow. It uses the
RDataFrame interface in ROOT for fast multi-threaded processing
of TTrees and outputs the resulting analysis-level objects to a ROOT file in an intuitive
Quantity(name, expression, binning, named_binnings=None)
The default binning for this quantity. Always defined, even if no special “named binnings” are.
Named binnings defined for this quantity.
Create a clone of quantity, replacing properties as specified by the keyword arguments.
Splitting keys for which named binnings have been defined for this quantity.
iter_bins(indices=slice(None, None, None))
Generator function. Yields a tuple (lo, hi) of bin edges for each bin. A subset may be obtained by providing a list of indices or a slice object.
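The behaviour described for iter_bins can be sketched as a standalone generator. This is not Lumberjack's implementation: it is a free function rather than a method, and it assumes that a binning is stored as a flat sequence of bin edges. The parameter name `edges` is introduced only for this illustration.

```python
def iter_bins(edges, indices=slice(None, None, None)):
    """Yield (lo, hi) for each selected bin.

    Assumption: `edges` is a sequence of N+1 bin edges describing N bins.
    `indices` may be a slice object or a list of bin indices.
    """
    bins = list(zip(edges[:-1], edges[1:]))
    if isinstance(indices, slice):
        selected = bins[indices]
    else:
        selected = [bins[i] for i in indices]
    for lo, hi in selected:
        yield (lo, hi)


edges = [0, 10, 20, 50]                          # three bins
all_bins = list(iter_bins(edges))                # default: every bin
first_two = list(iter_bins(edges, slice(0, 2)))  # subset via a slice
picked = list(iter_bins(edges, [0, 2]))          # subset via a list of indices
```

With the default `slice(None, None, None)` every bin is yielded, which matches the default argument in the documented signature.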
Retrieve a binning (if defined) for the case where the splitting key key has value value. If none is defined, return None.
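The lookup described above, together with the fallback to the default binning, might work along the following lines. This is only a sketch under stated assumptions: the nested mapping layout `{splitting key: {value: binning}}`, the helper names `lookup_named_binning` and `binning_for`, and the example keys and edges are all hypothetical and not taken from Lumberjack.

```python
def lookup_named_binning(named_binnings, key, value):
    """Return the binning registered for key == value, or None if undefined.

    Assumption: named_binnings is a nested mapping
    {splitting key: {value: binning}} or None.
    """
    if not named_binnings:
        return None
    return named_binnings.get(key, {}).get(value)


def binning_for(default_binning, named_binnings, key, value):
    """Use the named binning if one is defined, else the default binning."""
    named = lookup_named_binning(named_binnings, key, value)
    return default_binning if named is None else named


# Hypothetical example: coarser pt binning in the "forward" eta region.
default_binning = [10, 30, 50, 100]
named_binnings = {"eta_region": {"forward": [10, 50, 100]}}

forward = binning_for(default_binning, named_binnings, "eta_region", "forward")
central = binning_for(default_binning, named_binnings, "eta_region", "central")
missing = lookup_named_binning(named_binnings, "year", "2018")

# Splitting keys for which named binnings have been defined.
keys_with_named_binnings = list(named_binnings)
```

Returning None instead of raising an error lets the caller decide whether to fall back to the default binning, which is always defined.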