This lesson is still being designed and assembled (Pre-Alpha version)

CMSDAS CERN 2020: B2G Long Exercise: Glossary

Key Points

Wednesday DAS Plenary 10-11h	Here is a link to the indico page for Wednesday’s plenary.
Introduction to All-Hadronic b*->tW and Useful Tools	The first level of help is your peers; the facilitators are here to help with any issue you are unsure about. The big picture: an analysis begins with a motivating final state, then optimizes selections to keep signal and reject background, and finally evaluates all systematic uncertainties associated with the selections and modeling in order to quantify the observations. Each stage of this process needs dedicated studies and may be performed individually or as a larger team. The LHC and its experiments are preparing for Run 3 and the HL-LHC; we are exploring models motivated by exclusions and measurements made with Run 2 data. Eliminate the fear of losing data by using version control! Commit often, and document, document, document. Save yourself time later by setting up CI/CD tests.
Friday DAS Plenary 10-11h	Here is a link to the indico page for Friday’s plenary.
Diving into jet substructure	Jet substructure is a set of analysis techniques for measuring jet observables from constituent information. N-subjettiness quantifies how ‘N-pronged’ a jet looks: for N subjet axes, it is the pT-weighted sum of the angular distances between each constituent and its nearest subjet axis. Traditional top-tagging typically uses τ₃₂ and the jet mass, whereas W-tagging uses τ₂₁ and the jet mass.
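The N-subjettiness definition above can be sketched in a few lines. This is a minimal illustration, not the implementation used in CMS software: the subjet axes are passed in by hand (in practice they come from a clustering or minimization step), the jet radius `r0=0.8` is an assumed normalization, and the toy constituents are invented.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def tau_n(constituents, axes, r0=0.8):
    """N-subjettiness for N = len(axes).

    constituents: list of (pt, eta, phi); axes: list of (eta, phi).
    Each constituent contributes pt times the distance to its nearest axis.
    """
    norm = r0 * sum(pt for pt, _, _ in constituents)
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / norm

# Toy two-prong jet (invented numbers): two hard cores, each with a soft neighbour.
constituents = [(100.0, 0.0, 0.0), (10.0, 0.05, 0.0),
                (100.0, 0.4, 0.0), (10.0, 0.45, 0.0)]
tau1 = tau_n(constituents, axes=[(0.2, 0.0)])              # one axis between the prongs
tau2 = tau_n(constituents, axes=[(0.0, 0.0), (0.4, 0.0)])  # one axis per prong
tau21 = tau2 / tau1  # small ratio: the jet looks much more 2-pronged than 1-pronged
```

A small τ₂₁ is what a W-tagger looks for; the same function with three and two axes gives the τ₃₂ ratio used for top-tagging.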
Investigating the signal topology	In an all-hadronic final state, we veto leptons (electrons and muons) and study jet properties. Jets are clustered Particle Flow candidates, each interpreted as a four-vector with momentum, energy, and mass. The resonance we search for decays entirely to the particles in the final state, so summing the four-vectors of all the jets reconstructs the resonance.
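Reconstructing the resonance from the jet four-vectors can be sketched as below. This is a plain-Python illustration with invented kinematics; the helper names `four_vector` and `invariant_mass` are hypothetical, and a real analysis would normally use a vector library rather than hand-written arithmetic.

```python
import math

def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to Cartesian (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return (energy, px, py, pz)

def invariant_mass(jets):
    """Invariant mass of the component-wise sum of a list of four-vectors."""
    energy, px, py, pz = (sum(component) for component in zip(*jets))
    m2 = energy**2 - px**2 - py**2 - pz**2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

# Toy back-to-back top and W jets (invented numbers, GeV).
top_jet = four_vector(pt=1000.0, eta=0.0, phi=0.0, mass=173.0)
w_jet = four_vector(pt=1000.0, eta=0.0, phi=math.pi, mass=80.4)
m_tw = invariant_mass([top_jet, w_jet])  # roughly 2 TeV for this toy configuration
```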
Friday Lunch 12-13h
Developing a pre-selection	Preselection reduces data size, but further signal optimization is done later. Preselected events should be in good regions of the detector with appropriate filters. Stacked histograms are an important tool for creating cuts.
Monday DAS Plenary 10-11h
Optimising the analysis	Preselection alone is not enough to be sensitive to the signal; we need to tighten the selection to increase the significance of signal over background. The traditional way of optimizing selections is to apply N-1 (all-but-one) cuts and find the peak of the significance curve in the variable whose cut was removed. Boosted Decision Trees and other multivariate optimization techniques are also widely used.
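An N-1 scan can be sketched as follows. The example assumes the events have already passed every cut except the one being studied, uses S/√B as the significance (one common choice among several), and keeps events below the threshold, as one would for a τ₃₂-like variable. All event values and weights are invented.

```python
import math

def significance(n_sig, n_bkg):
    """Simple S/sqrt(B); returns 0 when there is no background to avoid dividing by zero."""
    return n_sig / math.sqrt(n_bkg) if n_bkg > 0 else 0.0

def scan_cut(signal, background, thresholds):
    """Scan 'value < threshold' on events that pass all other cuts.

    signal, background: lists of (value, weight) pairs.
    Returns the best (threshold, significance) and the full curve.
    """
    curve = []
    for thr in thresholds:
        n_sig = sum(w for v, w in signal if v < thr)
        n_bkg = sum(w for v, w in background if v < thr)
        curve.append((thr, significance(n_sig, n_bkg)))
    best = max(curve, key=lambda point: point[1])
    return best, curve

# Toy tau32-like values (invented): signal peaks low, background peaks high.
signal = [(0.3, 1.0), (0.4, 1.0), (0.5, 1.0), (0.7, 1.0)]
background = [(0.45, 1.0), (0.6, 1.0), (0.7, 1.0), (0.8, 1.0), (0.9, 1.0)]
best, curve = scan_cut(signal, background, thresholds=[0.35, 0.55, 0.75, 0.95])
```

Plotting `curve` gives the significance curve; `best` is its peak.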
Monday Lunch 12-13h
Controlling the background	The backgrounds are processes that produce high-pT jets that look like t+W. All-hadronic ttbar can look like t+W if one of the tops has a very soft (or untagged) b-jet; it is reasonably well modeled by MC. Single-top production is sub-dominant but irreducible; it is also reasonably well modeled by MC. Because the LHC is a hadron collider, multi-jet (QCD) production is almost always a background in all-hadronic analyses. W/Z+jets behaves much like multi-jet production, since the top tag that lets it into our signal selection comes from a chance combination of jets. Control regions (CRs) are used to study specific backgrounds in a kinematic region orthogonal to the signal regions (SRs); we test our background estimation in the CRs and apply the corrections derived there to the SR. Validation regions are designed to be orthogonal to, but lie between, the CRs and SRs, perhaps with a lower target background purity, to test the corrections extracted from the CRs before they are applied to the SRs. Blinding is the convention of not looking at data in the SR (or any signal-enriched selection).
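The region logic can be illustrated with a small sketch. The `top_tag_score` variable and the thresholds 0.5 and 0.8 are hypothetical, chosen only to show regions that are orthogonal by construction, with the validation region sitting between the control and signal regions; they are not the exercise's actual definitions.

```python
def assign_region(top_tag_score):
    """Map an event to exactly one region, so the regions cannot overlap."""
    if top_tag_score >= 0.8:   # hypothetical tight threshold
        return "SR"
    if top_tag_score >= 0.5:   # hypothetical intermediate window
        return "VR"
    return "CR"

def blind(events, is_data):
    """Drop signal-region events when the sample is real data; keep MC untouched."""
    if not is_data:
        return list(events)
    return [e for e in events if assign_region(e["top_tag_score"]) != "SR"]

# Toy events (invented scores).
events = [{"top_tag_score": s} for s in (0.1, 0.55, 0.9, 0.95)]
visible_data = blind(events, is_data=True)
visible_mc = blind(events, is_data=False)
```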
Background Estimation: 2DAlphabet	We can simultaneously fit several backgrounds by defining the appropriate observables/axes, such as m_t vs. m_tW. 2D Alphabet is a method that fits a polynomial function to this two-dimensional space (excluding the SR) to model the multi-jet background; it also serves as an interface to other tools.
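The idea of fitting a smooth function outside the SR and evaluating it inside can be sketched in one dimension. This is a deliberately reduced illustration and not the 2DAlphabet interface: it assumes the polynomial describes a pass-to-fail tagging ratio, uses a straight line instead of a 2D polynomial, does an unweighted least-squares fit, and runs on invented bin contents.

```python
def fit_line(xs, ys):
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy jet-mass bins (invented). None marks blinded signal-region bins in 'pass'.
bin_centers = [75.0, 105.0, 135.0, 165.0, 195.0, 225.0, 255.0]
fail_counts = [4000.0, 3000.0, 2000.0, 1500.0, 1200.0, 1000.0, 1000.0]
pass_counts = [220.0, 183.0, 134.0, None, None, 85.0, 91.0]

# Fit the pass/fail ratio using sideband bins only.
sideband = [(x, p / f) for x, f, p in zip(bin_centers, fail_counts, pass_counts)
            if p is not None]
slope, intercept = fit_line([x for x, _ in sideband], [r for _, r in sideband])

# Predict the multi-jet yield in the blinded bins: fail count times fitted ratio.
prediction = {x: f * (slope * x + intercept)
              for x, f, p in zip(bin_centers, fail_counts, pass_counts)
              if p is None}
```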
Tuesday DAS Plenary 10-11h
Modelling the background and signal with 2DAlphabet (a wrapper for Combine)	Luminosity uncertainties affect all processes equally. Cross section uncertainties should not be applied to signal. Template morphing is used to determine shape uncertainties.
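The two kinds of uncertainty can be sketched side by side. This is a simplified illustration: the luminosity effect is modeled as a single log-normal factor shared by every process, and the shape morphing is piecewise-linear between the nominal and the ±1σ templates, which is simpler than the interpolation Combine actually performs. The function names, yields, and templates are invented.

```python
def apply_lumi(yields, lumi_unc, theta):
    """Scale every process by the same factor (1 + lumi_unc) ** theta."""
    factor = (1.0 + lumi_unc) ** theta
    return {name: value * factor for name, value in yields.items()}

def morph(nominal, up, down, theta):
    """Piecewise-linear vertical morphing of a binned template.

    theta = 0 gives the nominal shape, +1 the 'up' template, -1 the 'down' one.
    """
    target = up if theta >= 0 else down
    return [n + abs(theta) * (t - n) for n, t in zip(nominal, target)]

# Toy inputs (invented numbers).
yields = {"signal": 50.0, "ttbar": 400.0, "multijet": 1000.0}
shifted = apply_lumi(yields, lumi_unc=0.025, theta=1.0)

nominal = [10.0, 20.0, 10.0]
up = [12.0, 20.0, 8.0]
down = [8.0, 20.0, 12.0]
half_up = morph(nominal, up, down, theta=0.5)
```

Every entry of `shifted` moves by the same relative amount, which is what "affects all processes equally" means in practice.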
Tuesday Lunch 12-13h
Obtaining upper cross section limits	We can define the expected sensitivity of an analysis with simulation only, by assuming the observed data will be exactly what the background estimation predicts. We combine the systematic variations into 1σ and 2σ envelopes around the nominal expectation to quantify the confidence of our measurement. After unblinding, a downward fluctuation of the data makes the observed exclusion appear stronger than expected, while an upward fluctuation makes it appear weaker.
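The effect of data fluctuations on a limit can be seen in a toy counting experiment. This sketch computes a 95% CL upper limit on the number of signal events with the CLs criterion for a single bin and no systematic uncertainties; it is not how Combine computes limits for this analysis, and the background yield of 10 events is invented.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, n_sig, n_bkg):
    """CLs = CL(s+b) / CL(b) for a single-bin counting experiment."""
    return poisson_cdf(n_obs, n_sig + n_bkg) / poisson_cdf(n_obs, n_bkg)

def upper_limit(n_obs, n_bkg, cl=0.95):
    """Signal yield at which CLs drops to 1 - cl, found by bisection."""
    low, high = 0.0, 1.0
    while cls(n_obs, high, n_bkg) > 1.0 - cl:  # expand until the limit is bracketed
        high *= 2.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if cls(n_obs, mid, n_bkg) > 1.0 - cl:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

n_bkg = 10.0                                    # toy background prediction
expected = upper_limit(n_obs=10, n_bkg=n_bkg)   # data assumed equal to background
down_fluct = upper_limit(n_obs=5, n_bkg=n_bkg)  # fewer events: tighter limit
up_fluct = upper_limit(n_obs=15, n_bkg=n_bkg)   # more events: looser limit
```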
Bonus: Calibrating a jet substructure tagger.	We can define tagger scale factors by measuring the efficiency of the tagger in t+W enriched regions.
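A scale factor measurement can be sketched as a ratio of efficiencies. The example assumes the usual convention of data efficiency divided by simulation efficiency, treats the enriched region as pure (no background subtraction), uses simple binomial uncertainties, and runs on invented counts.

```python
import math

def efficiency(n_pass, n_total):
    """Tagging efficiency with a simple binomial uncertainty."""
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

def scale_factor(data, mc):
    """Data/MC efficiency ratio; data and mc are (n_pass, n_total) pairs."""
    eff_data, err_data = efficiency(*data)
    eff_mc, err_mc = efficiency(*mc)
    sf = eff_data / eff_mc
    # Relative uncertainties added in quadrature, assuming independent samples.
    err = sf * math.sqrt((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2)
    return sf, err

# Toy counts (invented): 450 of 1000 data jets tagged, 500 of 1000 in simulation.
sf, sf_err = scale_factor(data=(450, 1000), mc=(500, 1000))
```

The resulting factor would be applied as a per-jet weight to simulated events that pass the tagger.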
DAS Presentations

Glossary

FIXME