Palisade API¶
Input modules¶
-
class
Palisade.
InputROOT
(files_spec=None)[source]¶ An input module for accessing objects from multiple ROOT files.
A nickname can be registered for each file, which then allows object retrieval by prefixing it to the object path (i.e.
<file_nickname>:<object_path_in_file>
). Single-file functionality is delegated to child
InputROOTFile
objects.
- Parameters
files_spec (dict, optional) – specification of file nicknames (keys) and paths pointed to (values). Can be omitted (files can be added later via
add_file
)
Usage example:
m = InputROOT()

# add a file and register a nickname for it
m.add_file('/path/to/rootfile.root', nickname='file0')

# optional: request object first (retrieves several objects at once)
m.request([dict(file_nickname='file0', object_path='MyDirectory/myObject')])

# retrieve an object from a file
my_object = m.get('file0:MyDirectory/myObject')

# apply simple arithmetical expressions to objects
my_sum_object = m.get_expr('"file0:MyDirectory1/myObject1" + "file0:MyDirectory2/myObject2"')

# use basic functions in expressions
my_object_noerrors = m.get_expr('discard_errors("file0:MyDirectory1/myObject1")')

# register user-defined input functions
InputROOT.add_function(my_custom_function)  # `my_custom_function` defined beforehand

# use function in expression:
my_function_result = m.get_expr('my_custom_function("file0:MyDirectory1/myObject1")')
-
classmethod
add_function
(function=None, name=None, override=False, memoize=False)[source]¶ Register a user-defined input function. Can also be used as a decorator.
Note
Functions are registered globally in the
InputROOT
class and are immediately available to all InputROOT
instances.
- Parameters
function (function, optional) – function or method to add. Can be omitted when used as a decorator.
name (str, optional) – function name. If not given, taken from
function.__name__
override (bool, optional) – if
True
, allow existing functions to be overridden (default: False)
memoize (bool, optional) – if
True
, store function result in a cache on first call. For every subsequent call with identical arguments, the result will be retrieved from the cache instead of evaluating the function again (default: False)
Usage examples:
as a simple decorator:
@InputROOT.add_function
def my_function(rootpy_object):
    ...
to override a function that has already been registered:
@InputROOT.add_function(override=True)
def my_function(rootpy_object):
    ...
to register a function under a different name:
@InputROOT.add_function(name='short_name')
def very_long_function_name_we_do_not_want_to_use_in_expressions(rootpy_object):
    ...
as a method:
InputROOT.add_function(my_function)
Note
All Palisade processors (and especially the
PlotProcessor
) expect the objects returned by functions to be valid rootpy objects. When implementing user-defined functions, make sure to convert “naked” ROOT (PyROOT) objects by wrapping them in rootpy.asrootpy
before returning them.
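The memoize option described above can be pictured as a cache keyed on the call arguments. The following is only a minimal plain-Python sketch of that idea; the names `memoized` and `slow_square` are hypothetical and not part of Palisade, and how Palisade actually stores its cache is not shown here.

```python
import functools


def memoized(func):
    """Cache results of `func`, keyed on its (hashable) positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)  # evaluated only on the first call
        return cache[args]

    wrapper.cache = cache  # exposed for inspection only
    return wrapper


calls = []  # records every real evaluation of the function body


@memoized
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(4)
second = slow_square(4)  # served from the cache; the body is not run again
```

Note that a cache of this kind only helps for functions whose result depends solely on their arguments.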
-
classmethod
get_function
(name)[source]¶ Retrieve a defined input function by name. Returns None if no such function exists.
-
add_file
(file_path, nickname=None)[source]¶ Add a ROOT file.
- Parameters
file_path (str) – path to ROOT file
nickname (str, optional) – file nickname. If not given,
file_path
will be used
-
get
(object_spec)[source]¶ Get an object from one of the registered files.
Tip
If calling
get
on multiple objects (e.g. in a loop), consider issuing a request
call for all objects beforehand. The first call to get
will then retrieve all requested objects in one go, opening and closing the file only once.
- Parameters
object_spec (str) – file nickname and path to object in ROOT file, separated by a colon, e.g.
"file_nickname:directory/object"
-
request
(request_specs)[source]¶ Request objects from the registered files. Requests for objects are stored until
get
is called for one of the objects. All requested objects are then retrieved in one go and cached.
- Parameters
request_specs (list of dict) – each dict represents a request for one object from one file.
A request dict must have either a key
object_spec
, which contains both the file nickname and the path to the object within the file (separated by a colon, :), or two keys file_nickname and object_path specifying these separately.
The following requests behave identically:
dict(file_nickname='file0', object_path="directory/object")
dict(object_spec="file0:directory/object")
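To illustrate why the two request forms above are equivalent, here is a small plain-Python sketch that reduces either form to the same (nickname, path) pair. The helper `normalize_request` is hypothetical and not part of Palisade. Splitting on the first colon only is an assumption of this sketch.

```python
def normalize_request(request):
    """Return (file_nickname, object_path) for either request form."""
    if "object_spec" in request:
        # assumption: only the first colon separates nickname and path
        nickname, _, path = request["object_spec"].partition(":")
        return nickname, path
    return request["file_nickname"], request["object_path"]


a = normalize_request(dict(file_nickname="file0", object_path="directory/object"))
b = normalize_request(dict(object_spec="file0:directory/object"))
```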
-
get_expr
(expr, locals={})[source]¶ Evaluate an expression involving objects retrieved from file(s).
The string given must be a valid Python expression.
Strings contained in the expression are interpreted as specifications of objects in files (see
get
for the object specification syntax). Before the expression is evaluated, all strings are replaced by the objects they refer to.
To interpret a string as a literal string (i.e. not referring to an object in a file), it must be wrapped inside the special function
str
.
Any functions called in the expression must have been defined beforehand using
add_function
. There are a number of special functions, which behave as follows:
str: interpret a string as a literal string
no_input: interpret all strings encountered anywhere inside the function call as literal strings
input: interpret all strings encountered anywhere inside the function call as specifications of objects in files
All Python identifiers used in the expression are interpreted as local variables. A map specifying the values of local variables for this call to get_expr can be given via the keyword argument locals.
Alternatively, local variables can be registered for use by all calls to
get_expr
by calling register_local
beforehand. Variables given in the locals dictionary will take precedence over those defined via register_local.
Local variable lookup can be disabled completely by passing
locals=None
.
- Parameters
expr (str) – valid Python expression
locals (dict or None, default: {}) – mapping of local variable names that may appear in expr to values.
These will override local variable values specified beforehand using
register_local
.
If None, local variable lookup is disabled and a NameError
will be raised if an identifier is encountered in the expression.
Usage examples:
my_result = my_input_root.get_expr(
    'my_function('                       # this function gets called
        '"my_file:path/to/my_object",'   # this gets replaced by ROOT object
        '42'                             # this argument is passed literally
    ')'
)

# register a local variable and assign it a value
my_input_root.register_local('local_variable', 42)

my_result = my_input_root.get_expr(
    'my_function('                       # this function gets called
        '"my_file:path/to/my_object",'   # this gets replaced by ROOT object
        'local_variable,'                # this gets replaced by its assigned value
    ')'
)
Tip
Writing expressions inside a single string can get very convoluted. To maintain legibility, the expression string can be spread out over several lines by taking advantage of Python’s automatic string concatenation inside parentheses (see above). Alternatively, triple-quoted strings can be used.
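The string-replacement behaviour of get_expr can be pictured with Python's standard `ast` module. The sketch below is not Palisade's implementation. It is a self-contained illustration in which every string literal is rewritten into a lookup in a plain dictionary that stands in for the registered files. The names `eval_with_objects` and `__lookup__` are hypothetical, and the special functions str, no_input and input are not modelled.

```python
import ast


class _StringsToLookups(ast.NodeTransformer):
    """Rewrite every string literal "spec" into the call __lookup__("spec")."""

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            call = ast.Call(
                func=ast.Name(id="__lookup__", ctx=ast.Load()),
                args=[ast.Constant(value=node.value)],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node  # numbers and other constants are passed literally


def eval_with_objects(expr, objects, functions=None, local_vars=None):
    """Evaluate `expr`, replacing strings by entries of the dict `objects`."""
    tree = _StringsToLookups().visit(ast.parse(expr, mode="eval"))
    ast.fix_missing_locations(tree)
    # only registered functions and local variables are visible (no builtins)
    env = {"__builtins__": {}, "__lookup__": objects.__getitem__}
    env.update(functions or {})
    env.update(local_vars or {})
    return eval(compile(tree, "<expr>", "eval"), env)


objects = {"file0:a": 2, "file0:b": 3}  # stand-ins for objects in ROOT files
total = eval_with_objects('"file0:a" + "file0:b"', objects)
scaled = eval_with_objects(
    'scale("file0:a", factor)',
    objects,
    functions={"scale": lambda obj, f: obj * f},
    local_vars={"factor": 10},
)
```

In this sketch an unknown identifier raises a NameError and an unknown object specification raises a KeyError.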
Processors¶
-
class
Palisade.
AnalyzeProcessor
(config, output_folder)[source]¶ Bases:
Karma.PostProcessing.Palisade.Processors._base._ProcessorBase
Processor for analyzing objects from ROOT files. The resulting objects are written to one or more output ROOT files.
Todo
API documentation
Initialize the processor.
- Parameters
config (dict) – processor configuration
output_folder (str) – directory in which to place the output files produced by this task
-
class
Palisade.
PlotProcessor
(config, output_folder)[source]¶ Bases:
Karma.PostProcessing.Palisade.Processors._base._ProcessorBase
Processor for plotting objects from ROOT files.
Todo
API documentation.
Initialize the processor.
- Parameters
config (dict) – processor configuration
output_folder (str) – directory in which to place the output files produced by this task
Context-dependent placeholders¶
A Palisade task configuration describes a workflow that involves accessing content from input files, processing it, and writing the result to output files. More often than not, the same workflow is applied to a series of inputs, specified by an expansion context. As a result, the concrete value of many configuration entries will depend on the particular context.
The need for context-dependent configurations is addressed by Palisade in two ways. The first involves performing string interpolation with the current context, while the second provides users with a series of dedicated placeholder objects to be used where a context-dependent value is needed.
Two high-level objects are provided for this purpose:
ContextValue
and InputValue
. The former resolves
to a concrete value provided in the current expansion context,
while the latter allows users to specify an expression for
retrieving objects from input files:
-
class
Palisade.
ContextValue
(spec)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
Configuration object. It is replaced by the value corresponding to the specification spec dispatched over the current context.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
InputValue
(expression)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
Configuration object. It is replaced by the result of evaluating expression as an expression involving input file objects. The expression is dispatched over the current expansion context.
See
get_expr
for more information about input expressions.
Note
The context must provide an input controller (e.g.
InputROOT
) under the key _input_controller.
Default constructor: parse all arguments as field names and nodes
A powerful feature of the above objects is that they can be combined using arbitrarily complex expressions. A large portion of native Python expression syntax is supported, including arithmetical expressions and string formatting operations. However, not all Python syntax is supported. Consult the section on lazy expressions below for an overview of supported operations and current limitations.
Expressions involving the above classes will store the entire expression syntax tree, allowing the values to be reconstructed at runtime, while also substituting context-dependent values from the current context.
Note
To disable string interpolation explicitly for a string,
it must be wrapped in the String
helper class.
Configuration helper classes¶
Context-dependent configuration entries are implemented using lazy evaluation techniques. A configuration entry whose value depends on an evaluation context cannot be initialized with a concrete value at configuration-time, so it is initialized in a lazy manner.
This means that it is initialized to a placeholder structure that contains all the necessary information to produce the concrete value, except the information contained in the evaluation context.
The placeholder is kept unevaluated until the evaluation context is available at run-time. Its concrete value is then determined by dispatching it over the context.
The implementation of the context-dependent configuration entries is done using a series of helper classes. Each class represents a type of node in the abstract syntax tree of expressions involving context-dependent values.
All such nodes inherit from the LazyNodeBase
class.
-
class
Palisade.
LazyNodeBase
[source]¶ The abstract base class from which all lazy node objects must inherit.
Default constructor: parse all arguments as field names and nodes
In particular, the high-level placeholders
ContextValue
and
InputValue
inherit from
LazyNodeBase
, so this section
also applies to them.
Nodes that support iteration derive from the
LazyIterableNodeBase
class:
-
class
Palisade.
LazyIterableNodeBase
[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
Iterable nodes must provide an __iter__ method
Default constructor: parse all arguments as field names and nodes
Simple lazy nodes¶
The simplest lazy node is the Lazy
node, which is used to wrap a (typically non-lazy) value
value
:
-
class
Palisade.
Lazy
(value)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy container for a single value value. It will be replaced with the contained value on evaluation.
Note that
Lazy
behaves differently from other lazy containers in that it does not call the eval method of its contained object. If Lazy
is used to contain objects derived from a lazy node class, their eval method must be called explicitly.
Default constructor: parse all arguments as field names and nodes
Lazy
nodes stay unevaluated until their eval
method is called:
>>> a = Lazy('a')
>>> a
Lazy('a')
>>> a.eval()
'a'
Lazy
nodes can be used as lazy containers for other Lazy
nodes. In this case, the
eval
method must be called multiple times to
resolve to the contained value:
>>> a = Lazy(Lazy('a'))
>>> a
Lazy(Lazy('a'))
>>> a.eval()
Lazy('a')
>>> a.eval().eval()
'a'
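The one-level-at-a-time evaluation shown above can be mimicked with a toy class. This is only a sketch of the described behaviour, not Palisade's Lazy class. `ToyLazy` and `eval_fully` are hypothetical names introduced for illustration.

```python
class ToyLazy:
    """Toy stand-in for a lazy container holding a single value."""

    def __init__(self, value):
        self.value = value

    def eval(self):
        # return the wrapped value as-is; a nested node is NOT evaluated
        return self.value

    def __repr__(self):
        return "ToyLazy({!r})".format(self.value)


def eval_fully(node):
    """Call eval() repeatedly until the result is no longer a ToyLazy."""
    while isinstance(node, ToyLazy):
        node = node.eval()
    return node


nested = ToyLazy(ToyLazy("a"))
one_level = nested.eval()       # still a ToyLazy
resolved = eval_fully(nested)   # the plain string
```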
Turning objects into lazy nodes¶
A non-lazy expression can be made lazy using the lazify
helper function:
Pretty-printing lazy nodes¶
As can be seen in the above note, lazy expressions can easily become large. To display a pretty-printed
structure of a lazy expression on multiple lines, the method pprint
is provided:
>>> l = [1, [2, {'3': (4, 5)}]]
>>> lazify(l).pprint()
List([
Lazy(1),
List([
Lazy(2),
Map({
Lazy('3') : List([
Lazy(4),
Lazy(5)
])
})
])
])
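The nesting in the pretty-printed output above suggests a recursive wrapping rule: lists and tuples become List nodes, dictionaries become Map nodes, and everything else is wrapped in Lazy. The sketch below reproduces only that structure as a string. `describe_lazified` is a hypothetical helper, not the real lazify, and the exact rules applied by Palisade may differ.

```python
def describe_lazified(obj):
    """Return a one-line description of how `obj` would be wrapped (toy)."""
    if isinstance(obj, (list, tuple)):
        inner = ", ".join(describe_lazified(o) for o in obj)
        return "List([" + inner + "])"
    if isinstance(obj, dict):
        inner = ", ".join(
            "{}: {}".format(describe_lazified(k), describe_lazified(v))
            for k, v in obj.items()
        )
        return "Map({" + inner + "})"
    return "Lazy({!r})".format(obj)


description = describe_lazified([1, [2, {'3': (4, 5)}]])
```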
Lazy containers¶
Two types of lazy containers are provided: List
and Map
,
which will evaluate to lists and dictionaries, respectively.
-
class
Palisade.
List
(elts)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyIterableNodeBase
A lazy object that acts as a list-like container for the elements contained in elts.
The elements of elts should be lazy objects. They will be evaluated when the enclosing
List
is evaluated.
Constructing a
List
from objects not derived from a lazy node class will cause these to be wrapped inside a Lazy
container.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
Map
(mapping)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that acts as a dict-like container for the key-value pairs contained in mapping.
Both the keys and values stored in mapping should be lazy objects. They will be evaluated when the enclosing
Map
is evaluated.
Constructing a
Map
with keys or values not derived from a lazy node class will cause these to be wrapped inside a Lazy
container.
Default constructor: parse all arguments as field names and nodes
Lazy expressions¶
Lazy nodes can be used almost seamlessly in Python expressions. Expressions involving lazy nodes will be lazy nodes themselves and evaluating them will cause the nodes to be evaluated. The result is in most cases identical to what the equivalent non-lazy expression would return.
Supported expressions include:
basic arithmetical and logical operations:
>>> (Lazy(2) + Lazy(3)).eval()
5
>>> (Lazy(True) & Lazy(False)).eval()
False
Warning
Logical operations must use the & and | operators, not the Python keywords ‘and’ and ‘or’, which will give the wrong results.
basic comparisons:
>>> (Lazy(2) < Lazy(3)).eval()
True
>>> (Lazy(2) > Lazy(3)).eval()
False
Note
Multi-term comparisons do not work!
>>> (Lazy(3) < Lazy(4) < Lazy(1)).eval()  # should be False
True
As a workaround, expand these using only binary comparisons:
>>> ((Lazy(3) < Lazy(4)) & (Lazy(4) < Lazy(1))).eval()
False
string formatting
>>> String("{0}{0}").format(Lazy('a')).eval()
'aa'
Note
The string containing the template expression has to be wrapped inside a
String
. Using a regular string here will not work:
>>> "{0}{0}".format(Lazy('a')).eval()
AttributeError: 'str' object has no attribute 'eval'
function calls:
>>> Lazy(str)(2).eval()  # function needs to be lazified
'2'
object attribute access:
>>> my_object.my_attribute = 42
>>> Lazy(my_object).my_attribute.eval()
42
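The warning about the ‘and’ and ‘or’ keywords follows from how Python itself works: these keywords cannot be overloaded, they only test the truthiness of their operands and return one of them, whereas & and | can be redefined to build an expression node. The toy classes below illustrate this under the assumption that a lazy node is an ordinary (and therefore truthy) Python object. `ToyNode` and `ToyBinOp` are hypothetical stand-ins, not Palisade classes.

```python
import operator


class ToyNode:
    """Toy lazy node that overloads & and | to build a deferred operation."""

    def __init__(self, value):
        self.value = value

    def eval(self):
        return self.value

    def __and__(self, other):
        return ToyBinOp(self, other, operator.and_)

    def __or__(self, other):
        return ToyBinOp(self, other, operator.or_)


class ToyBinOp(ToyNode):
    """Deferred binary operation on two toy nodes."""

    def __init__(self, left, right, op):
        self.left, self.right, self.op = left, right, op

    def eval(self):
        return self.op(self.left.eval(), self.right.eval())


# `&` builds a ToyBinOp, so both operands take part in the result
with_ampersand = (ToyNode(False) & ToyNode(True)).eval()

# `and` sees a truthy object on the left and simply returns the right operand
with_keyword = (ToyNode(False) and ToyNode(True)).eval()
```

In this sketch the keyword version silently discards the left operand, which is why its result differs from the & version.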
The following lazy nodes are used to represent the abstract syntax tree of the lazy expression:
-
class
Palisade.
Attribute
(obj, attr)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents access to the attribute attr of the object obj.
The object obj and the attribute attr should be lazy objects that evaluate to an object and a string, respectively. They will be evaluated when the Attribute is evaluated.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
BinOp
(left, right, op)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a binary operation op to be applied to the left and right operands.
The operands left and right should be lazy objects. They will be evaluated when the
BinOp
is evaluated.
The operator op must be a callable that takes exactly two positional arguments.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
Call
(func, args, kwargs)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents an invocation of a callable func with the positional arguments args and the keyword arguments kwargs.
The function func should be a lazy object that evaluates to a callable. It will be evaluated when the
Call
is evaluated.
The arguments args and kwargs must be lazy objects that evaluate to a list and a dictionary, respectively. They will be evaluated when the
Call
is evaluated and will be passed to the evaluated function as unpacked positional and keyword arguments, respectively.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
FormatString
(template, context)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a string template template together with a context context used to fill the template.
The template is a lazy object that evaluates to a string. It will be evaluated when the FormatString is evaluated.
The context must be a lazy object that evaluates to a dictionary that contains the keys ‘args’ and ‘kwargs’. The values corresponding to these keys will be evaluated when the
FormatString
is evaluated and will be passed to the format method of the evaluated template as unpacked positional and keyword arguments, respectively.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
Op
(operand, op)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a unary operation op to be applied to the operand operand.
The operand operand should be a lazy object. It will be evaluated when the
Op
is evaluated.
The operator op must be a callable that takes exactly one positional argument.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
String
(s)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a string s.
The passed object s should be a lazy object. It will be evaluated and passed through the built-in str method when the
String
is evaluated.
Default constructor: parse all arguments as field names and nodes
-
format
(*args, **kwargs)[source]¶ Convert to a lazy
FormatString
using the supplied arguments as a context.
-
Lazy control flow structures¶
The use of lazy nodes is not limited to basic expressions. With the help of the following classes, control structures such as conditionals and exception handlers can be implemented using lazy nodes:
conditional expressions using
If
>>> If(Lazy(True), Lazy('true!'), Lazy('false!')).eval()
'true!'
>>> If(Lazy(False), Lazy('yes!'), Lazy('no!')).eval()
'no!'
exception handling using
Try
>>> Lazy(len)(2).eval()
TypeError: object of type 'int' has no len()
>>> Try(Lazy(len)(2), TypeError, 'ERROR').eval()
'ERROR'
-
class
Palisade.
If
(condition, true_value, false_value)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a conditional expression that evaluates to either true_value or false_value depending on the truth value of the condition condition.
The condition, true_value and false_value should be lazy objects. They will be evaluated when the If is evaluated.
Default constructor: parse all arguments as field names and nodes
-
class
Palisade.
Try
(value, exception, value_on_exception)[source]¶ Bases:
Karma.PostProcessing.Palisade._lazy.LazyNodeBase
A lazy object that represents a try-except clause that attempts to evaluate to value. If an exception is thrown during evaluation, its type is checked against exception. If the exception is an instance of exception, it is caught and value_on_exception is evaluated and returned; otherwise the exception is re-raised.
Default constructor: parse all arguments as field names and nodes
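The try-except semantics described above can be sketched with two toy classes. This is not Palisade's implementation: `ToyCall` and `ToyTry` are hypothetical names, and for brevity the fallback value is returned as a plain value instead of being evaluated as a lazy node.

```python
class ToyCall:
    """Toy deferred call: func(*args) runs only when eval() is called."""

    def __init__(self, func, *args):
        self.func, self.args = func, args

    def eval(self):
        return self.func(*self.args)


class ToyTry:
    """Toy deferred try-except around the evaluation of `value`."""

    def __init__(self, value, exception, value_on_exception):
        self.value = value
        self.exception = exception
        self.value_on_exception = value_on_exception

    def eval(self):
        try:
            return self.value.eval()
        except self.exception:
            # only exceptions matching `exception` are caught here;
            # anything else propagates to the caller
            return self.value_on_exception


caught = ToyTry(ToyCall(len, 2), TypeError, "ERROR").eval()
passed = ToyTry(ToyCall(len, "ab"), TypeError, "ERROR").eval()
```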
Built-in input functions¶
A number of common functions are already registered in
InputROOT
as “built-ins”. They are listed below.
Warning
The functionality of each of these functions is not stable and may change in the future. This list is provided for the sake of completeness only.
-
class
Palisade._input.
_ROOTObjectFunctions
[source]¶ -
-
static
histdivide
(tobject_1, tobject_2, option='')[source]¶ divide two histograms, taking error calculation option into account
-
static
max_yield_index
(yields, efficiencies, eff_threshold)[source]¶ for each bin, return index of object in yields which maximizes the yield, subject to the efficiency remaining above threshold
-
static
max_value_index
(tobjects)[source]¶ for each bin i, return index of object in tobjects which contains the largest value for bin i
-
static
select
(tobjects, indices)[source]¶ the content of each bin i in the return object is taken from the object whose index in tobjects is given by bin i in indices
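The bin-wise logic of max_value_index and select can be illustrated on plain Python lists of bin contents. The functions below are simplified stand-ins that share the names of the built-ins but operate on lists instead of ROOT objects. Errors, under/overflow bins and tie-breaking rules of the real functions are not modelled (this sketch picks the first maximum).

```python
def max_value_index(histograms):
    """For each bin i, index of the histogram with the largest content."""
    n_bins = len(histograms[0])
    return [
        max(range(len(histograms)), key=lambda k: histograms[k][i])
        for i in range(n_bins)
    ]


def select(histograms, indices):
    """Take bin i from the histogram whose index is given by indices[i]."""
    return [histograms[k][i] for i, k in enumerate(indices)]


h0 = [1.0, 5.0, 2.0]
h1 = [3.0, 4.0, 2.5]
indices = max_value_index([h0, h1])
envelope = select([h0, h1], indices)
```

Chaining the two, as done here, yields the bin-wise maximum of the inputs.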
-
static
mask_lookup_value
(tobject, tobject_lookup, lookup_value)[source]¶ bin i in return object is bin i in tobject if bin i in tobject_lookup is equal to lookup_value
-
static
apply_efficiency_correction
(tobject, efficiency, threshold=None)[source]¶ Divide each bin in tobject by the corresponding bin in efficiency. If efficiency is lower than threshold, the number of events is set to zero.
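As a rough illustration of the efficiency correction described above, the sketch below applies the same rule to plain lists of bin contents. It is a simplified stand-in, not the built-in itself: error propagation is ignored, and treating `threshold=None` as "no threshold" is an assumption of this sketch.

```python
def apply_efficiency_correction(contents, efficiencies, threshold=None):
    """Divide each bin by its efficiency; zero bins below the threshold."""
    corrected = []
    for n_events, eff in zip(contents, efficiencies):
        if threshold is not None and eff < threshold:
            corrected.append(0.0)  # efficiency too low: discard the bin
        else:
            corrected.append(n_events / eff)
    return corrected


corrected = apply_efficiency_correction(
    [10.0, 5.0, 1.0], [0.5, 0.25, 0.125], threshold=0.2
)
```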
-
static
efficiency_graph
(tobject_numerator, tobject_denominator)[source]¶ Compute TEfficiency with proper Clopper-Pearson intervals
-
static
threshold
(tobject, min_value)[source]¶ returns a histogram like tobject with bins set to zero if they fall below the minimum value and to one if not. Errors are always set to zero
-
static
max_val_min_err
(*tobjects)[source]¶ binwise ‘max’ on value followed by a binwise ‘min’ on error.
-
static
mask_if_less
(tobject, tobject_ref)[source]¶ set tobject bins and their errors to zero if their content is less than the value in tobject_ref
-
static
double_profile
(tprofile_x, tprofile_y)[source]¶ creates a graph with points whose x and y values and errors are taken from the bins of two profiles with identical binning
-
static
threshold_by_ref
(tobject, tobject_ref)[source]¶ set tobject bins to zero if their content is less than the value in tobject_ref, and to 1 otherwise. Result bin errors are always set to zero.
-
static
normalize_x
(tobject)[source]¶ Normalize bin contents of each x slice of a TH2D by dividing by the y integral over each x slice.
-
static
unfold
(th1d_input, th2d_response, th1d_marginal_gen, th1d_marginal_reco)[source]¶ Use TUnfold to unfold a reconstructed spectrum.
- Parameters
th1d_input (ROOT.TH1D) – measured distribution to unfold
th2d_response (ROOT.TH2D) – 2D response histogram. Contains event numbers per (gen, reco) bin after rejecting spurious reconstructions and accounting for losses due to the reco acceptance. Gen bins should be on the x axis. Overflow/underflow should not be present and will be ignored! Acceptance losses and spurious reconstructions (“fakes”) are inferred from the difference between the projections of the response and the full marginal distributions, which are given separately.
th1d_marginal_gen (ROOT.TH1D) – marginal distribution on gen-level Contains event numbers per gen bin, without accounting for losses due to detector acceptance. The losses are inferred by comparing to the projection of the 2D response histogram, where these losses are accounted for.
th1d_marginal_reco (ROOT.TH1D) – marginal distribution on reco-level Contains event numbers per reco bin, without subtracting spurious reconstructions (“fakes”). The fakes are inferred by comparing to the projection of the 2D response histogram, where these fakes are not present.
-
static
normalize_to_ref
(tobject, tobject_ref)[source]¶ Normalize tobject to the integral over tobject_ref.
-
static
cumulate
(tobject)[source]¶ Make value of n-th bin equal to the sum of all bins up to and including n (but excluding underflow bins).
-
static
cumulate_reverse
(tobject)[source]¶ Make value of n-th bin equal to the sum of all bins from n up to and including the last bin (but excluding overflow bins).
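The two cumulative sums described above can be illustrated on a plain list of bin contents (with underflow and overflow bins already excluded). These are simplified stand-ins operating on lists rather than ROOT histograms, and bin errors are not handled.

```python
from itertools import accumulate


def cumulate(contents):
    """Bin n becomes the sum of bins 0..n (inclusive)."""
    return list(accumulate(contents))


def cumulate_reverse(contents):
    """Bin n becomes the sum of bins n..last (inclusive)."""
    return list(accumulate(reversed(contents)))[::-1]


bins = [1, 2, 3, 4]
forward = cumulate(bins)
backward = cumulate_reverse(bins)
```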
-
static