Palisade API

Input modules

class Palisade.InputROOT(files_spec=None)[source]

An input module for accessing objects from multiple ROOT files.

A nickname can be registered for each file, which then allows object retrieval by prefixing it to the object path (i.e. <file_nickname>:<object_path_in_file>).

Single-file functionality is delegated to child InputROOTFile objects.

Parameters

files_spec (dict, optional) – specification of file nicknames (keys) and paths pointed to (values). Can be omitted (files can be added later via add_file)

Usage example:

m = InputROOT()

# add a file and register a nickname for it
m.add_file('/path/to/rootfile.root', nickname='file0')

# optional: request object first (retrieves several objects at once)
m.request([dict(file_nickname='file0', object_path='MyDirectory/myObject')])

# retrieve an object from a file
my_object = m.get('file0:MyDirectory/myObject')

# apply simple arithmetical expressions to objects
my_sum_object = m.get_expr('"file0:MyDirectory1/myObject1" + "file0:MyDirectory2/myObject2"')

# use basic functions in expressions
my_object_noerrors = m.get_expr('discard_errors("file0:MyDirectory1/myObject1")')

# register user-defined input functions
InputROOT.add_function(my_custom_function)  # `my_custom_function` defined beforehand

# use function in expression:
my_function_result = m.get_expr('my_custom_function("file0:MyDirectory1/myObject1")')
classmethod add_function(function=None, name=None, override=False, memoize=False)[source]

Register a user-defined input function. Can also be used as a decorator.

Note

Functions are registered globally in the InputROOT class and are immediately available to all InputROOT instances.

Parameters
  • function (function, optional) – function or method to add. Can be omitted when used as a decorator.

  • name (str, optional) – function name. If not given, taken from function.__name__

  • override (bool, optional) – if True, allow existing functions to be overridden (default: False)

  • memoize (bool, optional) – if True, store function result in a cache on first call. For every subsequent call with identical arguments, the result will be retrieved from the cache instead of evaluating the function again. (default: False)

Usage examples:

  • as a simple decorator:

    @InputROOT.add_function
    def my_function(rootpy_object):
        ...
    
  • to override a function that has already been registered:

    @InputROOT.add_function(override=True)
    def my_function(rootpy_object):
        ...
    
  • to register a function under a different name:

    @InputROOT.add_function(name='short_name')
    def very_long_function_name_we_do_not_want_to_use_in_expressions(rootpy_object):
        ...
    
  • as a method:

    InputROOT.add_function(my_function)
    

Note

All Palisade processors (and especially the PlotProcessor) expect the objects returned by functions to be valid rootpy objects. When implementing user-defined functions, make sure to convert “naked” ROOT (PyROOT) objects by wrapping them in rootpy.asrootpy before returning them.
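
The decorator forms listed above rely on a common Python pattern: a registration function that returns a decorator when it is called without a function. The following is a minimal, self-contained sketch of that pattern and is not Palisade's actual implementation. The Registry class and its _functions dictionary are hypothetical names used for illustration only, and the cache assumes hashable positional arguments.

```python
import functools


class Registry(object):
    """Toy stand-in for a class-level function registry (hypothetical)."""

    _functions = {}

    @classmethod
    def add_function(cls, function=None, name=None, override=False, memoize=False):
        def _register(func):
            key = name if name is not None else func.__name__
            if key in cls._functions and not override:
                raise ValueError("function '{}' already registered".format(key))
            if memoize:
                cache = {}

                @functools.wraps(func)
                def _cached(*args):
                    # evaluate only on the first call with these arguments
                    if args not in cache:
                        cache[args] = func(*args)
                    return cache[args]

                cls._functions[key] = _cached
            else:
                cls._functions[key] = func
            return func

        # used as `@add_function(...)`: hand back the decorator itself
        if function is None:
            return _register
        # used as `@add_function` or `add_function(func)`: register directly
        return _register(function)


calls = []  # records every real evaluation of slow_square


@Registry.add_function(name='square', memoize=True)
def slow_square(x):
    calls.append(x)
    return x * x


registered = Registry._functions['square']
print(registered(3), registered(3), len(calls))  # 9 9 1
```

Because the registry in this sketch lives on the class rather than on an instance, a function registered once is visible everywhere the class is used, which mirrors the global registration described in the note above.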

classmethod get_function(name)[source]

Retrieve a defined input function by name. Returns None if no such function exists.

add_file(file_path, nickname=None)[source]

Add a ROOT file.

Parameters
  • file_path (str) – path to ROOT file

  • nickname (str, optional) – file nickname. If not given, file_path will be used

get(object_spec)[source]

Get an object from one of the registered files.

Tip

If calling get on multiple objects (e.g. in a loop), consider issuing a request call for all objects beforehand. The first call to get will then retrieve all requested objects in one go, opening and closing the file only once.

Parameters

object_spec (str) – file nickname and path to object in ROOT file, separated by a colon, e.g. "file_nickname:directory/object"

request(request_specs)[source]

Request objects from the registered files. Requests for objects are stored until get is called for one of the objects. All requested objects are then retrieved in one go and cached.

Parameters

request_specs (list of dict) – each dict represents a request for one object from one file.

A request dict must have either a key object_spec, which contains both the file nickname and the path to the object within the file (separated by a colon, :), or two keys file_nickname and object_path specifying these separately.

The following requests behave identically:

  • dict(file_nickname='file0', object_path="directory/object")

  • dict(object_spec="file0:directory/object")
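
The equivalence of the two request forms can be illustrated with a small helper that reduces either form to a (file nickname, object path) pair. This is only a sketch: normalize_request is a hypothetical name, and splitting at the first colon is an assumption rather than documented Palisade behavior.

```python
def normalize_request(spec):
    """Return (file_nickname, object_path) for either request form."""
    if 'object_spec' in spec:
        # assumption: split at the first colon only
        file_nickname, object_path = spec['object_spec'].split(':', 1)
    else:
        file_nickname = spec['file_nickname']
        object_path = spec['object_path']
    return file_nickname, object_path


a = normalize_request(dict(file_nickname='file0', object_path='directory/object'))
b = normalize_request(dict(object_spec='file0:directory/object'))
print(a == b)  # True
```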

get_expr(expr, locals={})[source]

Evaluate an expression involving objects retrieved from file(s).

The string given must be a valid Python expression.

Strings contained in the expression are interpreted as specifications of objects in files (see get for the object specification syntax). Before the expression is evaluated, all strings are replaced by the objects they refer to.

To interpret a string as a literal string (i.e. not referring to an object in a file), it must be wrapped inside the special function str.

Any functions called in the expression must have been defined beforehand using add_function. There are a number of special functions, which behave as follows:

  • str: interpret a string as a literal string

  • no_input: interpret all strings encountered anywhere inside the function call as literal strings

  • input: interpret all strings encountered anywhere inside the function call as specifications of objects in files

All Python identifiers used in the expression are interpreted as local variables. A map specifying the values of local variables for this call to get_expr can be given via the keyword argument locals.

Alternatively, local variables can be registered for use by all calls to get_expr by calling register_local beforehand. Variables given in the locals dictionary will take precedence over those defined via register_local.

Local variable lookup can be disabled completely by passing locals=None.

Parameters
  • expr (str) – valid Python expression

  • locals (dict or None (default: {})) –

    mapping of local variable names that may appear in expr to values.

    These will override local variable values registered beforehand using register_local.

    If None, local variable lookup is disabled and a NameError will be raised if an identifier is encountered in the expression.

Usage examples:

my_result = my_input_root.get_expr(
    'my_function('                      # this function gets called
        '"my_file:path/to/my_object",'  # this gets replaced by ROOT object
        '42'                            # this argument is passed literally
    ')'
)
# register a local variable and assign it a value
my_input_root.register_local('local_variable', 42)

my_result = my_input_root.get_expr(
    'my_function('                      # this function gets called
        '"my_file:path/to/my_object",'  # this gets replaced by ROOT object
        'local_variable,'               # this gets replaced by its assigned value
    ')'
)

Tip

Writing expressions inside a single string can get very convoluted. To maintain legibility, the expression string can be spread out over several lines by taking advantage of Python’s automatic string concatenation inside parentheses (see above). Alternatively, triple-quoted strings can be used.
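
The replacement of strings by objects described for get_expr can be sketched with the standard-library ast module. The code below is an illustration under stated assumptions, not the InputROOT implementation: eval_input_expr, the objects dictionary (standing in for ROOT files) and the _get helper are all hypothetical, and only the special function str is handled.

```python
import ast


class _ReplaceStrings(ast.NodeTransformer):
    """Rewrite every string literal "spec" into the call _get("spec")."""

    def visit_Call(self, node):
        # leave the arguments of str(...) untouched (literal strings)
        if isinstance(node.func, ast.Name) and node.func.id == 'str':
            return node
        return self.generic_visit(node)

    def visit_Constant(self, node):
        if not isinstance(node.value, str):
            return node
        lookup = ast.Call(
            func=ast.Name(id='_get', ctx=ast.Load()),
            args=[ast.Constant(value=node.value)],
            keywords=[],
        )
        return ast.copy_location(lookup, node)


def eval_input_expr(expr, objects, functions, local_vars=None):
    """Evaluate expr with strings replaced by entries of objects."""
    tree = ast.parse(expr, mode='eval')
    tree = ast.fix_missing_locations(_ReplaceStrings().visit(tree))
    namespace = dict(functions)
    namespace.update(local_vars or {})  # locals take precedence here
    namespace.update(_get=objects.__getitem__, str=str)
    code = compile(tree, '<input expression>', 'eval')
    return eval(code, {'__builtins__': {}}, namespace)


objects = {'file0:a': 2, 'file0:b': 3}   # plain numbers instead of ROOT objects
functions = {'double': lambda x: 2 * x}

print(eval_input_expr('"file0:a" + "file0:b"', objects, functions))            # 5
print(eval_input_expr('double("file0:a") + n', objects, functions, {'n': 1}))  # 5
print(eval_input_expr('str("file0:a")', objects, functions))                   # file0:a
```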

register_local(name, value)[source]

Register a local variable to be used when evaluating an expression using get_expr.

Parameters
  • name (str) – valid Python identifier. Must not have been registered before.

  • value – any Python object to be made accessible in expressions under name.

clear_locals()[source]

Clear all locals defined via register_local.

Processors

class Palisade.AnalyzeProcessor(config, output_folder)[source]

Bases: Karma.PostProcessing.Palisade.Processors._base._ProcessorBase

Processor for analyzing objects from ROOT files. The resulting objects are written to one or more output ROOT files.

Todo

API documentation

Initialize the processor.

Parameters
  • config (dict) – processor configuration

  • output_folder (str) – directory in which to place the output files produced by this task

class Palisade.PlotProcessor(config, output_folder)[source]

Bases: Karma.PostProcessing.Palisade.Processors._base._ProcessorBase

Processor for plotting objects from ROOT files.

Todo

API documentation.

Initialize the processor.

Parameters
  • config (dict) – processor configuration

  • output_folder (str) – directory in which to place the output files produced by this task

clear_figures()[source]

Close all figures created while running this processor.

class Palisade.Processors._base._ProcessorBase(config, output_folder)[source]

Abstract base class from which all processors inherit.

Initialize the processor.

Parameters
  • config (dict) – processor configuration

  • output_folder (str) – directory in which to place the output files produced by this task

run(show_progress=True)[source]

Run the processor.

Parameters

show_progress (bool) – if True, a progress bar will be shown

Context-dependent placeholders

A Palisade task configuration describes a workflow that involves accessing content from input files, processing it, and writing the result to output files. More often than not, the same workflow is applied to a series of inputs, specified by an expansion context. As a result, the concrete value of many configuration entries will depend on the particular context.

The need for context-dependent configurations is addressed by Palisade in two ways. The first involves performing string interpolation with the current context, while the second provides users with a series of dedicated placeholder objects to be used where a context-dependent value is needed.

Two high-level objects are provided for this purpose: ContextValue and InputValue. The former resolves to a concrete value provided in the current expansion context, while the latter allows users to specify an expression for retrieving objects from input files:

class Palisade.ContextValue(spec)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

Configuration object. Is replaced by the value corresponding to the specification spec dispatched over the current context.

Default constructor: parse all arguments as field names and nodes

eval(context)[source]

Evaluate this node with an optional context.

class Palisade.InputValue(expression)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

Configuration object. It is replaced by the result of evaluating expression as an expression involving input file objects. The expression is dispatched over the current expansion context.

See get_expr for more information about input expressions.

Note

The context must provide an input controller (e.g. InputROOT) under the key _input_controller.

Default constructor: parse all arguments as field names and nodes

eval(context)[source]

Evaluate this node with an optional context.

A powerful feature of the above objects is that they can be combined using arbitrarily complex expressions. A large portion of native Python expression syntax is supported, including arithmetical expressions and string formatting operations. However, not all Python syntax is supported. Consult the “Lazy expressions” section below for an overview of supported operations and current limitations.

Expressions involving the above classes will store the entire expression syntax tree, allowing the values to be reconstructed at runtime, while also substituting context-dependent values from the current context.

Note

To disable string interpolation explicitly for a string, it must be wrapped in the String helper class.

Configuration helper classes

Context-dependent configuration entries are implemented using lazy evaluation techniques. A configuration entry whose value depends on an evaluation context cannot be initialized with a concrete value at configuration-time, so it is initialized in a lazy manner.

This means that it is initialized to a placeholder structure that contains all the necessary information to produce the concrete value, except the information contained in the evaluation context.

The placeholder is kept unevaluated until the evaluation context is available at run-time. Its concrete value is then determined by dispatching it over the context.

The implementation of the context-dependent configuration entries is done using a series of helper classes. Each class represents a type of node in the abstract syntax tree of expressions involving context-dependent values.

All such nodes inherit from the LazyNodeBase class.

class Palisade.LazyNodeBase[source]

The abstract base class from which all lazy node objects must inherit.

Default constructor: parse all arguments as field names and nodes

pprint()[source]

Pretty-print entire dependency tree for this node.

abstract eval(context=None)[source]

Evaluate this node with an optional context.

In particular, the high-level placeholders ContextValue and InputValue inherit from LazyNodeBase, so this section also applies to them.
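
The idea of storing an expression as a tree of nodes and resolving it later against a context can be sketched in a few lines. The classes below (Node, Value, FromContext, BinOpNode) are hypothetical toy stand-ins written for this illustration. They do not reproduce the Palisade classes and support only two operators.

```python
import operator


class Node(object):
    """Toy base class: operators build new nodes instead of computing values."""

    def __add__(self, other):
        return BinOpNode(self, wrap(other), operator.add)

    def __lt__(self, other):
        return BinOpNode(self, wrap(other), operator.lt)

    def eval(self, context=None):
        raise NotImplementedError


class Value(Node):
    """Holds a concrete value."""

    def __init__(self, value):
        self.value = value

    def eval(self, context=None):
        return self.value


class FromContext(Node):
    """Resolves to context[key] once a context is available."""

    def __init__(self, key):
        self.key = key

    def eval(self, context=None):
        return context[self.key]


class BinOpNode(Node):
    """Stores both operands and the operation until eval is called."""

    def __init__(self, left, right, op):
        self.left, self.right, self.op = left, right, op

    def eval(self, context=None):
        return self.op(self.left.eval(context), self.right.eval(context))


def wrap(obj):
    """Wrap plain Python objects so they can take part in the tree."""
    return obj if isinstance(obj, Node) else Value(obj)


expr = FromContext('n_bins') + 1     # nothing is computed yet
print(expr.eval({'n_bins': 10}))     # 11
print(expr.eval({'n_bins': 20}))     # 21
```

The same stored tree is evaluated twice with different contexts, which is the property that makes a single configuration entry reusable across an expansion context.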

Nodes that support iteration derive from the LazyIterableNodeBase class:

class Palisade.LazyIterableNodeBase[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

Iterable nodes must provide an __iter__ method

Default constructor: parse all arguments as field names and nodes

Simple lazy nodes

The simplest lazy node is the Lazy node, which is used to wrap a (typically non-lazy) value:

class Palisade.Lazy(value)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy container for a single value. It will be replaced with the contained value on evaluation.

Note that Lazy behaves differently from other lazy containers in that it does not call the eval method of its contained object. If Lazy is used to contain objects derived from a lazy node class, their eval method must be called explicitly.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

Lazy nodes stay unevaluated until their eval method is called:

>>> a = Lazy('a')
>>> a
Lazy('a')
>>> a.eval()
'a'

Lazy nodes can be used as lazy containers for other Lazy nodes. In this case, the eval method must be called multiple times to resolve to the contained value:

>>> a = Lazy(Lazy('a'))
>>> a
Lazy(Lazy('a'))
>>> a.eval()
Lazy('a')
>>> a.eval().eval()
'a'

Turning objects into lazy nodes

A non-lazy expression can be made lazy using the lazify helper function:

Palisade.lazify(obj)[source]

Convert an object into a lazy version of the object.

Returns lazy node objects unchanged. Converts tuples and lists into List objects and dicts into Map objects. Wraps everything else in a Lazy container.
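
The conversion rules stated above amount to a short recursive dispatch on the type of the object. The sketch below uses hypothetical toy classes (ToyNode, ToyLazy, ToyList, ToyMap) and a function toy_lazify in place of the Palisade ones. It only illustrates the rules and makes no claim about the real implementation.

```python
class ToyNode(object):
    """Common base class so that existing nodes can be recognized."""


class ToyLazy(ToyNode):
    def __init__(self, value):
        self.value = value

    def eval(self, context=None):
        return self.value


class ToyList(ToyNode):
    def __init__(self, elts):
        # non-lazy elements are converted recursively
        self.elts = [toy_lazify(elt) for elt in elts]

    def eval(self, context=None):
        return [elt.eval(context) for elt in self.elts]


class ToyMap(ToyNode):
    def __init__(self, mapping):
        self.items = [(toy_lazify(k), toy_lazify(v)) for k, v in mapping.items()]

    def eval(self, context=None):
        return {k.eval(context): v.eval(context) for k, v in self.items}


def toy_lazify(obj):
    if isinstance(obj, ToyNode):
        return obj                 # already lazy: returned unchanged
    if isinstance(obj, (tuple, list)):
        return ToyList(obj)
    if isinstance(obj, dict):
        return ToyMap(obj)
    return ToyLazy(obj)            # everything else is wrapped


node = toy_lazify([1, [2, {'3': (4, 5)}]])
print(node.eval())  # [1, [2, {'3': [4, 5]}]]
```

In this sketch tuples come back as lists on evaluation, because both are mapped to the same list-like node.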

Pretty-printing lazy nodes

As can be seen in the above note, lazy expressions can easily become large. To display a pretty-printed structure of a lazy expression on multiple lines, the method pprint is provided:

>>> l = [1, [2, {'3': (4, 5)}]]
>>> lazify(l).pprint()
List([
  Lazy(1),
  List([
    Lazy(2),
    Map({
      Lazy('3') : List([
        Lazy(4),
        Lazy(5)
      ])
    })
  ])
])

Lazy containers

Two types of lazy containers are provided: List and Map, which will evaluate to lists and dictionaries, respectively.

class Palisade.List(elts)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyIterableNodeBase

A lazy object that acts as a list-like container for the elements contained in elts.

The elements of elts should be lazy objects. They will be evaluated when the enclosing List is evaluated.

Constructing a List from objects not derived from a lazy node class will cause these to be wrapped inside a Lazy container.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.Map(mapping)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that acts as a dict-like container for the key-value pairs contained in mapping.

Both the keys and values stored in mapping should be lazy objects. They will be evaluated when the enclosing Map is evaluated.

Constructing a Map with keys or values not derived from a lazy node class will cause these to be wrapped inside a Lazy container.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

Lazy expressions

Lazy nodes can be used almost seamlessly in Python expressions. Expressions involving lazy nodes will be lazy nodes themselves and evaluating them will cause the nodes to be evaluated. The result is in most cases identical to what the equivalent non-lazy expression would return.

Supported expressions include:

  • basic arithmetical and logical operations:

    >>> (Lazy(2) + Lazy(3)).eval()
    5
    
    >>> (Lazy(True) & Lazy(False)).eval()
    False
    

    Warning

    Logical operations must use the & and | operators. The Python keywords ‘and’ and ‘or’ cannot be used here and will give wrong results.

  • basic comparisons:

    >>> (Lazy(2) < Lazy(3)).eval()
    True
    
    >>> (Lazy(2) > Lazy(3)).eval()
    False
    

    Note

    Multi-term comparisons do not work!

    >>> (Lazy(3) < Lazy(4) < Lazy(1)).eval()  # should be False
    True
    

    As a workaround, expand these using only binary comparisons:

    >>> (((Lazy(3) < Lazy(4)) & (Lazy(4) < Lazy(1)))).eval()
    False
    
  • string formatting

    >>> String("{0}{0}").format(Lazy('a')).eval()
    'aa'
    

    Note

    The string containing the template expression has to be wrapped inside a String. Using a regular string here will not work:

    >>> "{0}{0}".format(Lazy('a')).eval()
    AttributeError: 'str' object has no attribute 'eval'
    
  • function calls:

    >>> Lazy(str)(2).eval()  # function needs to be lazified
    '2'
    
  • object attribute access:

    >>> my_object.my_attribute = 42
    >>> Lazy(my_object).my_attribute.eval()
    42
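
The restrictions on logical keywords and multi-term comparisons noted in the list above follow from how Python itself works: the operators & and < can be overloaded, whereas ‘and’ only tests the truth value of its left operand. The sketch below demonstrates this with a hypothetical toy class (Node, with the helper const) and is not based on the Palisade source.

```python
class Node(object):
    """Toy lazy node wrapping a zero-argument callable."""

    def __init__(self, thunk):
        self._thunk = thunk

    def eval(self):
        return self._thunk()

    def __and__(self, other):
        return Node(lambda: self.eval() & other.eval())

    def __lt__(self, other):
        return Node(lambda: self.eval() < other.eval())


def const(value):
    return Node(lambda: value)


# `&` is dispatched to __and__, so both operands are stored and evaluated later
print((const(True) & const(False)).eval())   # False

# `and` cannot be overloaded: the left node is an ordinary (truthy) object,
# so Python simply returns the right operand
print((const(False) and const(True)).eval())  # True, although the answer should be False

# a < b < c means (a < b) and (b < c), so it inherits the same problem
print((const(5) < const(4) < const(6)).eval())  # True, although 5 < 4 is False

# the workaround with binary comparisons joined by `&` gives the right answer
print(((const(5) < const(4)) & (const(4) < const(6))).eval())  # False
```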
    

The following lazy nodes are used to represent the abstract syntax tree of the lazy expression:

class Palisade.Attribute(obj, attr)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents access to the attribute attr of the object obj.

The object obj and the attribute attr should be lazy objects that evaluate to an object and a string, respectively. They will be evaluated when the Attribute is evaluated.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.BinOp(left, right, op)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a binary operation op to be applied to the left and right operands.

The operands left and right should be lazy objects. They will be evaluated when the BinOp is evaluated.

The operator op must be a callable that takes exactly two positional arguments.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.Call(func, args, kwargs)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents an invocation of a callable func with the positional arguments args and the keyword arguments kwargs.

The function func should be a lazy object that evaluates to a callable. It will be evaluated when the Call is evaluated.

The arguments args and kwargs must be lazy objects that evaluate to a list and a dictionary, respectively. They will be evaluated when the Call is evaluated and will be passed to the evaluated function as unpacked positional and keyword arguments, respectively.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.FormatString(template, context)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a string template template together with a context context used to fill the template.

The template is a lazy object that evaluates to a string. It will be evaluated when the FormatString is evaluated.

The context must be a lazy object that evaluates to a dictionary that contains the keys ‘args’ and ‘kwargs’. The values corresponding to these keys will be evaluated when the FormatString is evaluated and will be passed to the format method of the evaluated template as unpacked positional and keyword arguments, respectively.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.Op(operand, op)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a unary operation op to be applied to the operand operand.

The operand operand should be a lazy object. It will be evaluated when the Op is evaluated.

The operator op must be a callable that takes exactly one positional argument.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.String(s)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a string s.

The passed object s should be a lazy object. It will be evaluated and passed through the built-in str method when the String is evaluated.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

format(*args, **kwargs)[source]

Convert to a lazy FormatString using the supplied arguments as a context.
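
Deferred string formatting of this kind can be pictured as storing the template together with unevaluated arguments and calling the built-in str.format only at evaluation time. The sketch below uses two hypothetical toy classes, ContextKey and DeferredFormat, and is not the FormatString implementation.

```python
class ContextKey(object):
    """Toy placeholder that resolves to context[key]."""

    def __init__(self, key):
        self.key = key

    def eval(self, context):
        return context[self.key]


class DeferredFormat(object):
    """Stores a template and its arguments until a context is known."""

    def __init__(self, template, *args, **kwargs):
        self.template = template
        self.args = args
        self.kwargs = kwargs

    def eval(self, context):
        # resolve the arguments first, then fill the template
        args = [arg.eval(context) for arg in self.args]
        kwargs = {name: arg.eval(context) for name, arg in self.kwargs.items()}
        return self.template.format(*args, **kwargs)


path = DeferredFormat('{0}/{name}_hist', ContextKey('directory'), name=ContextKey('sample'))
print(path.eval({'directory': 'dir', 'sample': 'data'}))  # dir/data_hist
```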

Lazy control flow structures

The use of lazy nodes is not limited to basic expressions. With the help of the following classes, control structures such as conditionals and exception handlers can be implemented using lazy nodes:

  • conditional expressions using If

    >>> If(Lazy(True), Lazy('true!'), Lazy('false!')).eval()
    'true!'
    >>> If(Lazy(False), Lazy('yes!'), Lazy('no!')).eval()
    'no!'
    
  • exception handling using Try

    >>> Lazy(len)(2).eval()
    TypeError: object of type 'int' has no len()
    >>> Try(Lazy(len)(2), TypeError, 'ERROR').eval()
    'ERROR'
    
class Palisade.If(condition, true_value, false_value)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a conditional expression that evaluates to either true_value or false_value depending on the truth value of the condition condition.

The condition, true_value and false_value should be lazy objects. They will be evaluated when the If is evaluated.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.

class Palisade.Try(value, exception, value_on_exception)[source]

Bases: Karma.PostProcessing.Palisade._lazy.LazyNodeBase

A lazy object that represents a try-except clause that attempts to evaluate to value. If an exception is thrown during evaluation, its type is checked against exception. If the exception is an instance of exception, it is caught and value_on_exception is evaluated and returned, otherwise it is re-raised.

Default constructor: parse all arguments as field names and nodes

eval(context=None)[source]

Evaluate this node with an optional context.
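
A minimal sketch of how conditional and exception-handling nodes of this kind could be built is shown below. The classes Const, Apply, ToyIf and ToyTry are hypothetical and written only for this illustration. In particular, ToyIf evaluates only the selected branch, which is an assumption and may differ from the behavior of If.

```python
class Const(object):
    """Toy node holding a fixed value."""

    def __init__(self, value):
        self.value = value

    def eval(self, context=None):
        return self.value


class Apply(object):
    """Toy node for a deferred call func(arg)."""

    def __init__(self, func, arg):
        self.func, self.arg = func, arg

    def eval(self, context=None):
        return self.func(self.arg)


class ToyIf(object):
    def __init__(self, condition, true_value, false_value):
        self.condition = condition
        self.true_value = true_value
        self.false_value = false_value

    def eval(self, context=None):
        # assumption: only the selected branch is evaluated
        if self.condition.eval(context):
            return self.true_value.eval(context)
        return self.false_value.eval(context)


class ToyTry(object):
    def __init__(self, value, exception, value_on_exception):
        self.value = value
        self.exception = exception
        self.value_on_exception = value_on_exception

    def eval(self, context=None):
        try:
            return self.value.eval(context)
        except self.exception:
            # other exception types propagate unchanged
            return self.value_on_exception.eval(context)


print(ToyIf(Const(False), Const('yes!'), Const('no!')).eval())  # no!
print(ToyTry(Apply(len, 2), TypeError, Const('ERROR')).eval())  # ERROR
```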

Built-in input functions

A number of common functions are already registered in InputROOT as “built-ins”. They are listed below.

Warning

The functionality of each of these functions is not stable and may change in the future. This list is provided for the sake of completeness only.

class Palisade._input._ROOTObjectFunctions[source]
static hist(tobject)[source]

Turn a profile into a histogram.

static histdivide(tobject_1, tobject_2, option='')[source]

divide two histograms, taking error calculation option into account

static max_yield_index(yields, efficiencies, eff_threshold)[source]

for each bin, return index of the object in yields which maximizes the yield, subject to the efficiency remaining above eff_threshold

static max_value_index(tobjects)[source]

for each bin i, return index of object in tobjects which contains the largest value for bin i

static select(tobjects, indices)[source]

the content of each bin i in the return object is taken from the object whose index in tobjects is given by bin i in indices
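
The bin-wise logic of max_value_index and select can be illustrated with plain Python lists standing in for histograms. This is a sketch of the described behavior only: the helper names are hypothetical, errors and under/overflow bins are ignored, and ties are resolved in favor of the first object, which is an assumption.

```python
def max_value_index_sketch(histograms):
    """For each bin i, index of the histogram with the largest value in bin i."""
    n_bins = len(histograms[0])
    return [
        max(range(len(histograms)), key=lambda j: histograms[j][i])
        for i in range(n_bins)
    ]


def select_sketch(histograms, indices):
    """Bin i of the result is taken from histograms[indices[i]]."""
    return [histograms[j][i] for i, j in enumerate(indices)]


histograms = [[1, 5, 2], [3, 4, 9]]   # two "histograms" with three bins each
indices = max_value_index_sketch(histograms)
print(indices)                             # [1, 0, 1]
print(select_sketch(histograms, indices))  # [3, 5, 9]
```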

static mask_lookup_value(tobject, tobject_lookup, lookup_value)[source]

bin i in return object is bin i in tobject if bin i in tobject_lookup is equal to lookup_value

static apply_efficiency_correction(tobject, efficiency, threshold=None)[source]

Divide each bin in tobject by the corresponding bin in efficiency. If efficiency is lower than threshold, the number of events is set to zero.
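
As a rough illustration of the rule just described, the following sketch applies it to plain lists of bin contents. It is not the built-in implementation: efficiency_correct_bins is a hypothetical name, bin errors are ignored, and setting bins with zero efficiency to zero is an added assumption to avoid division by zero.

```python
def efficiency_correct_bins(values, efficiencies, threshold=None):
    """Divide each bin by its efficiency; zero bins below the threshold."""
    corrected = []
    for value, eff in zip(values, efficiencies):
        below_threshold = threshold is not None and eff < threshold
        if eff == 0 or below_threshold:   # eff == 0 guard is an assumption
            corrected.append(0.0)
        else:
            corrected.append(value / eff)
    return corrected


print(efficiency_correct_bins([50.0, 10.0, 5.0], [0.5, 0.25, 0.125], threshold=0.2))
# [100.0, 40.0, 0.0]
```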

static efficiency(tobject_numerator, tobject_denominator)[source]

Compute TEfficiency

static efficiency_graph(tobject_numerator, tobject_denominator)[source]

Compute TEfficiency with proper Clopper-Pearson intervals

static project_x(tobject)[source]

Apply ProjectionX() operation.

static project_y(tobject)[source]

Apply ProjectionY() operation.

static diagonal(th2d)[source]

Return a TH1D containing the main diagonal of an input TH2D.

static yerr(tobject)[source]

replace bin value with bin error and set bin error to zero

static atleast(tobject, min_value)[source]

mask all values below threshold

static threshold(tobject, min_value)[source]

returns a histogram like tobject with bins set to zero if they fall below the minimum value and to one if not. Errors are always set to zero

static discard_errors(tobject)[source]

set all bin errors to zero

static bin_width(tobject)[source]

replace bin value with width of bin and set bin error to zero

static max(*tobjects)[source]

binwise max for a collection of histograms with identical binning

static max_val_min_err(*tobjects)[source]

binwise ‘max’ on value followed by a binwise ‘min’ on error.

static mask_if_less(tobject, tobject_ref)[source]

set tobject bins and their errors to zero if their content is less than the value in tobject_ref

static double_profile(tprofile_x, tprofile_y)[source]

creates a graph with points whose x and y values and errors are taken from the bins of two profiles with identical binning

static threshold_by_ref(tobject, tobject_ref)[source]

set tobject bins to zero if their content is less than the value in tobject_ref, and to 1 otherwise. Result bin errors are always set to zero.

static normalize_x(tobject)[source]

Normalize bin contents of each x slice of a TH2D by dividing by the y integral over each x slice.
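
The normalization can be pictured on a nested list in which matrix[ix][iy] holds the content of bin (ix, iy). The sketch below is illustrative only: normalize_x_sketch is a hypothetical name, errors and under/overflow bins are ignored, and leaving empty slices at zero is an assumption.

```python
def normalize_x_sketch(matrix):
    """Divide every x slice (a list of y bins) by its sum over y."""
    normalized = []
    for y_bins in matrix:
        total = sum(y_bins)
        if total == 0:
            # assumption: an empty slice stays empty
            normalized.append([0.0 for _ in y_bins])
        else:
            normalized.append([value / total for value in y_bins])
    return normalized


print(normalize_x_sketch([[1.0, 3.0], [2.0, 2.0], [0.0, 0.0]]))
# [[0.25, 0.75], [0.5, 0.5], [0.0, 0.0]]
```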

static unfold(th1d_input, th2d_response, th1d_marginal_gen, th1d_marginal_reco)[source]

Use TUnfold to unfold a reconstructed spectrum.

Parameters
  • th1d_input (ROOT.TH1D) – measured distribution to unfold

  • th2d_response (ROOT.TH2D) – 2D response histogram. Contains event numbers per (gen, reco) bin after rejecting spurious reconstructions and accounting for losses due to the reco acceptance. Gen bins should be on the x axis. Overflow/underflow should not be present and will be ignored! Acceptance losses and spurious reconstructions (“fakes”) are inferred from the difference between the projections of the response and the full marginal distributions, which are given separately.

  • th1d_marginal_gen (ROOT.TH1D) – marginal distribution on gen-level. Contains event numbers per gen bin, without accounting for losses due to detector acceptance. The losses are inferred by comparing to the projection of the 2D response histogram, where these losses are accounted for.

  • th1d_marginal_reco (ROOT.TH1D) – marginal distribution on reco-level. Contains event numbers per reco bin, without subtracting spurious reconstructions (“fakes”). The fakes are inferred by comparing to the projection of the 2D response histogram, where these fakes are not present.

static normalize_to_ref(tobject, tobject_ref)[source]

Normalize tobject to the integral over tobject_ref.

static cumulate(tobject)[source]

Make value of n-th bin equal to the sum of all bins up to and including n (but excluding underflow bins).

static cumulate_reverse(tobject)[source]

Make value of n-th bin equal to the sum of all bins from n up to and including the last bin (but excluding overflow bins).
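
The two cumulative sums can be sketched on a plain list of regular bin contents (no underflow or overflow entries) using itertools.accumulate from the standard library. The helper names are hypothetical and bin errors are not treated.

```python
from itertools import accumulate


def cumulate_sketch(bins):
    """Bin n becomes the sum of bins 0..n."""
    return list(accumulate(bins))


def cumulate_reverse_sketch(bins):
    """Bin n becomes the sum of bins n..last."""
    return list(accumulate(reversed(bins)))[::-1]


bins = [1, 2, 3, 4]
print(cumulate_sketch(bins))          # [1, 3, 6, 10]
print(cumulate_reverse_sketch(bins))  # [10, 9, 7, 4]
```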

static bin_differences(tobject)[source]

Make value of n-th bin equal to the difference between the n-th and (n-1)-th bins.

static bin_ratios(tobject)[source]

Make value of n-th bin equal to the ratio between the n-th and (n-1)-th bins.