Reading and writing EDM4hep files in Gaudi
The facilities to read and write EDM4hep (or, more generally, event data models
based on podio) are provided by k4FWCore.
This page describes their usage without going into too much detail about their
internals. It also assumes a certain familiarity with Gaudi: most of the
snippets show only the relevant part of the configuration, not a complete
runnable example.
The k4DataSvc
Whenever you want to work with EDM4hep in the Gaudi-based framework of Key4hep,
you will need to use the k4DataSvc as EventDataSvc. You can instantiate and
configure this service as follows:
from Gaudi.Configuration import *
from Configurables import k4DataSvc
evtSvc = k4DataSvc("EventDataSvc")
It is important that the name is EventDataSvc in this case, because Gaudi
assumes this name for the event data service. Once you have instantiated the
k4DataSvc, you still have to make the ApplicationMgr aware of it by making sure
that evtSvc is in the list of external services (ExtSvc):
from Configurables import ApplicationMgr
ApplicationMgr(
    # other args
    ExtSvc=[evtSvc],
)
Reading events
To read events you will need to use the PodioInput algorithm in addition to
the k4DataSvc. Currently, you will need to pass the input file to the k4DataSvc
via the input option but pass the collections that you want to read to the
PodioInput. We are working on improving this (discussion happens in this
issue). The parts of your options file related to reading EDM4hep files will
look something like this:
from Configurables import PodioInput, k4DataSvc
evtSvc = k4DataSvc("EventDataSvc")
evtSvc.input = "/path/to/your/input-file.root"
podioInput = PodioInput()
It is possible to change the input file from the command line via
k4run <your-options-file> --EventDataSvc.input=<input-file>
By default the PodioInput will read all collections that are available from
the input file. It is possible to limit the collections that should become
available via the collections option:
podioInput.collections = [
    # List of collection names that should be made available
]
Writing events
To write events you will need to use the PodioOutput algorithm in addition to
the k4DataSvc:
from Configurables import PodioOutput
podioOutput = PodioOutput("PodioOutput", filename="my_output.root")
By default this will write the complete event contents to the output file.
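Putting the pieces from the sections above together, a minimal options file that reads a file and writes its complete contents back out might look like the following sketch. The TopAlg, EvtSel, EvtMax and OutputLevel settings are not discussed on this page; they are assumptions based on common Gaudi usage and may need adjusting for your setup. The file paths are placeholders.

```python
from Gaudi.Configuration import INFO
from Configurables import ApplicationMgr, PodioInput, PodioOutput, k4DataSvc

# The service has to be named "EventDataSvc"
evtSvc = k4DataSvc("EventDataSvc")
evtSvc.input = "/path/to/your/input-file.root"

# Reads all collections by default
podioInput = PodioInput()

# Writes the complete event contents by default
podioOutput = PodioOutput("PodioOutput", filename="my_output.root")

ApplicationMgr(
    TopAlg=[podioInput, podioOutput],  # assumption: input first, output last
    EvtSel="NONE",                     # assumption: no Gaudi event selector
    EvtMax=10,                         # assumption: process 10 events
    ExtSvc=[evtSvc],
    OutputLevel=INFO,
)
```

Such a file would be run via k4run <your-options-file>, as shown for the command-line override above.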
Writing only a subset of collections
Sometimes it is desirable to limit the output to a subset of all collections
available from the EventStore. The PodioOutput allows this via the
outputCommands option, which takes a list of keep or drop commands. Each
command must consist of the keep/drop command and a target. The target is a
collection name that may include the ? or * wildcard patterns. This might
look like the following:
podioOutput.outputCommands = ["keep *"]
which will keep everything (the default), while
podioOutput.outputCommands = ["drop *"]
will simply drop all collections and effectively write an empty file (apart from
some metadata). A common pattern is to "drop *" and then selectively add keep
commands for the collections to retain, e.g. to keep only the highest-level MC
and reco information:
podioOutput.outputCommands = [
    "drop *",
    "keep MCParticlesSkimmed",
    "keep PandoraPFOs",
    "keep RecoMCTruthLink",
]
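To get a feeling for which collections a given list of commands selects, the keep/drop logic can be mimicked in plain Python. This is only an illustrative sketch and not the PodioOutput implementation: the function select_collections is hypothetical, it assumes that commands are applied in order so that the last matching command decides, and it uses fnmatchcase from the standard library, which understands ? and * but also [seq] patterns that PodioOutput may not support. The collection names MCParticles and VertexBarrelHits are made up for the example.

```python
from fnmatch import fnmatchcase


def select_collections(available, output_commands):
    """Return the names from `available` that survive the keep/drop commands."""
    kept = []
    for name in available:
        keep = True  # keeping everything is the documented default
        for command in output_commands:
            parts = command.split()
            if len(parts) != 2 or parts[0] not in ("keep", "drop"):
                raise ValueError(f"malformed command: {command!r}")
            action, pattern = parts
            # Assumption: commands apply in order, so the last match decides
            if fnmatchcase(name, pattern):
                keep = action == "keep"
        if keep:
            kept.append(name)
    return kept


# Hypothetical event content; only the last three names appear on this page
available = [
    "MCParticles",
    "VertexBarrelHits",
    "MCParticlesSkimmed",
    "PandoraPFOs",
    "RecoMCTruthLink",
]
commands = [
    "drop *",
    "keep MCParticlesSkimmed",
    "keep PandoraPFOs",
    "keep RecoMCTruthLink",
]
selected = select_collections(available, commands)
print(selected)
```

Under these assumptions only the three explicitly kept collections remain, while an empty command list leaves all collections selected.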