Efficient analysis with ROOT: Glossary

Key Points

Introduction
  • You’ll learn how to install ROOT on your system or get access to systems with ROOT pre-installed!

  • You’ll learn how to use ROOT in C++ and Python!

  • You’ll learn about commonly used features in ROOT!

  • You’ll learn how to get help with ROOT!

  • You’ll learn to do efficient data analysis with an example based on NanoAOD files!

Get ROOT
ROOT in C++ and Python
  • The choice of interactive C++, compiled C++ or Python is based on the use case!

  • Usage of C++ code, compiled with optimization flags, may save you hours of computing time!

  • PyROOT lets you use C++ from Python and also offers many advanced features to speed up your analysis in Python. Details about the dynamic Python bindings provided by PyROOT can be found at https://root.cern/manual/python.
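
As a small illustration of calling C++ from Python through PyROOT, the sketch below declares a C++ function to ROOT's interpreter and calls it from Python. The function name `sketch_sum_squares` and the include guard are made up for this example, and the code assumes a ROOT installation with Python bindings.

```python
def call_cpp_from_python():
    # Imported here so that ROOT is only needed when the function is called.
    import ROOT

    # Declare a (hypothetical) C++ function to ROOT's interpreter. The include
    # guard makes it safe to call this Python function more than once.
    ROOT.gInterpreter.Declare("""
    #ifndef SKETCH_SUM_SQUARES
    #define SKETCH_SUM_SQUARES
    double sketch_sum_squares(double a, double b) { return a * a + b * b; }
    #endif
    """)

    # The declared C++ function is now reachable as an attribute of ROOT.
    return ROOT.sketch_sum_squares(3.0, 4.0)
```

Calling `call_cpp_from_python()` returns 25.0. The same mechanism can be used to move a performance-critical helper to C++ while keeping the steering code in Python.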

Commonly used features in ROOT
  • ROOT provides many features, from histogramming, fitting and plotting to interactive investigation of data in C++ and Python
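
A minimal sketch of histogramming and fitting in PyROOT could look as follows. The histogram name, binning and sample size are arbitrary choices for illustration; it assumes ROOT's predefined `gaus` function with its default parameters (mean 0, width 1).

```python
def fill_and_fit(n=10000):
    # Imported here so that ROOT is only needed when the function is called.
    import ROOT

    ROOT.gROOT.SetBatch(True)  # do not open any graphics window

    # Book a histogram and fill it with n values drawn from a Gaussian.
    h = ROOT.TH1D("h_sketch", "Gaussian sample;x;entries", 50, -5.0, 5.0)
    h.FillRandom("gaus", n)

    # Fit a Gaussian: "Q" keeps the fit quiet, "0" stores the fitted
    # function in the histogram without drawing it.
    h.Fit("gaus", "Q0")
    fitted = h.GetFunction("gaus")

    # Parameter 1 is the mean, parameter 2 the width of the Gaussian.
    return h.GetEntries(), fitted.GetParameter(1), fitted.GetParameter(2)
```

The fitted mean and width should come out close to the values the sample was generated with. To look at the result, draw the histogram on a `TCanvas` and save it with `SaveAs`.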

Efficient analysis with RDataFrame
  • RDataFrame is the recommended entry point for efficient analysis

  • RDataFrame is lazy: declare first what you want to do and let ROOT run all of your tasks as efficiently as possible in one go, in parallel!

  • Parallelization on multiple threads requires only the ROOT.EnableImplicitMT() statement
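
The lazy behaviour described above can be sketched with a toy dataset. The column `x`, the cut and the histogram model are invented for this example; `rdfentry_` is RDataFrame's built-in entry-number column, and `GetNRuns()` is assumed to be available, which should hold for reasonably recent ROOT releases.

```python
def lazy_single_loop(n_events=1000):
    # Imported here so that ROOT is only needed when the function is called.
    import ROOT

    # Must be called before the RDataFrame is constructed.
    ROOT.EnableImplicitMT()

    # A toy dataset without input files: one column derived from the entry number.
    df = ROOT.RDataFrame(n_events).Define("x", "double(rdfentry_ % 10)")

    # Booking results is lazy: nothing is computed yet.
    n_all = df.Count()
    n_low = df.Filter("x < 5", "x below 5").Count()
    hist = df.Histo1D(("h_x", "x;x;entries", 10, 0.0, 10.0), "x")
    runs_before = df.GetNRuns()

    # Accessing the first result triggers one event loop that fills all of them.
    values = (n_all.GetValue(), n_low.GetValue(), hist.GetEntries())
    runs_after = df.GetNRuns()
    return values, runs_before, runs_after
```

In this sketch, the number of event loops reported before accessing any result should be zero, and one afterwards, even though three results were booked.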

How to get help with ROOT?
  • User support is an integral part of ROOT!

  • https://root.cern is the entry point to find all documentation

  • The reference guide provides in-depth technical documentation, as well as additional explanations of classes and a large number of tutorials that explain features with code

  • The ROOT forum is actively maintained by the ROOT team to support you!

NanoAOD analysis: Introduction
  • The analysis studies Higgs boson decays to two tau leptons with a muon and a hadronic tau in the final state

  • The input files are (reduced) CMS NanoAOD files, which keeps the example very close to an actual analysis in CMS

  • The following steps show, hands-on, how RDataFrame is used in an actual analysis

NanoAOD analysis: Skim the initial datasets
  • We reduce the initial datasets by filtering suitable events and selecting interesting observables.

  • This step includes finding the interesting muon-tau pair in each selected event.

  • To perform this computationally expensive part of the analysis as efficiently as possible, we enable ROOT’s implicit multi-threading and use RDataFrame in C++!

  • ROOT::RVec is an extension of std::vector that makes it easy to work with collections, similar to NumPy arrays in Python.
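
To give a flavour of the NumPy-like behaviour of ROOT::RVec, the sketch below builds two small collections inside an RDataFrame and selects elements with a mask. The column names and the numerical values, including the cuts, are invented for illustration and are not the selection used in the analysis.

```python
def rvec_selection():
    # Imported here so that ROOT is only needed when the function is called.
    import ROOT

    # One toy event with four (hypothetical) muons.
    df = (
        ROOT.RDataFrame(1)
        .Define("Muon_pt", "ROOT::RVec<float>{12.f, 25.f, 40.f, 31.f}")
        .Define("Muon_eta", "ROOT::RVec<float>{0.5f, 1.2f, -2.4f, 0.3f}")
        # Element-wise comparisons return a mask, as with NumPy arrays.
        .Define("good", "Muon_pt > 20 && abs(Muon_eta) < 2.1")
        # Count the selected muons and take the highest pt among them.
        .Define("nGood", "int(Sum(good))")
        .Define("leadGoodPt", "Max(Muon_pt[good])")
    )

    # Book both results first so that they are computed in the same event loop.
    n_good = df.Sum("nGood")
    lead_pt = df.Max("leadGoodPt")
    return n_good.GetValue(), lead_pt.GetValue()
```

With the values above, two muons pass the mask and the highest transverse momentum among them is 31, so the function returns `(2, 31.0)`.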

NanoAOD analysis: Produce histograms
  • We produce histograms of all physics processes and all observables.

  • All histograms are produced in a signal region with opposite-sign muon-tau pairs and in a control region with same-sign pairs, which is used for the data-driven QCD estimate.

  • This step shows the usage of RDataFrame in Python producing a large number of histograms in a single event loop and in parallel!
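
The pattern of booking many histograms and filling them in one parallel event loop can be sketched as follows. The column names (`q_1`, `q_2`, `pt_1`, `pt_2`), the toy values derived from `rdfentry_` and the binning are assumptions made for this example and do not reproduce the analysis code.

```python
def book_region_histograms(n_events=1000):
    # Imported here so that ROOT is only needed when the function is called.
    import ROOT

    ROOT.EnableImplicitMT()

    # Toy dataset: charges and transverse momenta derived from the entry number.
    df = (
        ROOT.RDataFrame(n_events)
        .Define("q_1", "rdfentry_ % 2 == 0 ? 1 : -1")
        .Define("q_2", "rdfentry_ % 4 < 2 ? 1 : -1")
        .Define("pt_1", "20.0 + double(rdfentry_ % 50)")
        .Define("pt_2", "30.0 + double(rdfentry_ % 40)")
    )

    # Signal region: opposite-sign pairs. Control region: same-sign pairs.
    regions = {
        "signal": df.Filter("q_1 * q_2 < 0", "opposite sign"),
        "control": df.Filter("q_1 * q_2 > 0", "same sign"),
    }
    observables = {"pt_1": (30, 20.0, 80.0), "pt_2": (30, 20.0, 80.0)}

    # Book every histogram first; nothing is computed at this point.
    booked = {}
    for region, node in regions.items():
        for var, (nbins, low, high) in observables.items():
            model = (f"{var}_{region}", f"{var};{var};entries", nbins, low, high)
            booked[(region, var)] = node.Histo1D(model, var)

    # Accessing the first histogram runs the event loop once for all of them.
    entries = {key: hist.GetEntries() for key, hist in booked.items()}
    return entries, df.GetNRuns()
```

Because all histograms hang off the same RDataFrame, adding more observables or regions to the dictionaries does not add further event loops.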

NanoAOD analysis: Make the plots
  • The plotting step combines all histograms to produce estimates of the physics processes and to create a figure with a physical meaning.

  • The plots show the share of the contributing physics processes in the data, but without systematic uncertainties.

  • The script shows how you can produce publication-quality plots with ROOT!
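
As a rough illustration of the data-driven QCD estimate from the same-sign control region, the sketch below works on plain lists of bin contents. It assumes one common recipe: subtract the simulated processes from the data in the control region, set negative bins to zero, and scale the result by an extrapolation factor. The function name, the numbers and the factor are hypothetical and not taken from the analysis.

```python
def estimate_qcd(data_ss, simulated_ss, extrapolation=1.0):
    """Bin-wise QCD estimate from same-sign (control region) histograms.

    data_ss:       bin contents of the data in the control region
    simulated_ss:  mapping from process name to its bin contents there
    extrapolation: assumed factor to transfer the estimate to the signal region
    """
    # Sum all simulated processes bin by bin.
    total_simulated = [sum(bins) for bins in zip(*simulated_ss.values())]
    # Data minus simulation, clipped at zero, then scaled.
    return [
        extrapolation * max(data - sim, 0.0)
        for data, sim in zip(data_ss, total_simulated)
    ]


# Made-up bin contents for three bins.
data_ss = [120.0, 80.0, 30.0]
simulated_ss = {"W+jets": [40.0, 30.0, 20.0], "ttbar": [10.0, 5.0, 15.0]}
qcd = estimate_qcd(data_ss, simulated_ss, extrapolation=0.5)
print(qcd)  # [35.0, 22.5, 0.0]
```

In a ROOT-based script the same arithmetic would typically be done directly on histograms rather than on Python lists.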

Glossary