Versatile physicist adept at uncovering crucial insights from data through advanced analytics, specializing in machine learning and statistics. Demonstrated leadership in diverse projects within influential CERN collaborations, contributing significantly to impactful published research.
Recent Projects
Jet classification
This project applies deep neural networks common in computer vision applications to distinguish different sources of jets using “jet images”.
Physics with jets is essential for the success of the LHC physics program. Jet clustering combines calorimeter deposits or tracks in an attempt to relate observations to theoretical predictions.
There is a large effort in both the experiment and theory communities to improve and extend jet tools.
Advanced ML techniques play a major role in these developments, starting with jet flavour tagging, where they showed an impressive improvement in performance.
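As a rough illustration of the jet-image idea, the sketch below pixelates jet constituents into a fixed grid in the (η, φ) plane, weighting each pixel by transverse momentum. The function name `make_jet_image`, the 16×16 grid, and the ±0.8 window are assumptions chosen for illustration, not the project's actual code, and the constituents are assumed to be already centred on the jet axis.

```python
import numpy as np


def make_jet_image(eta, phi, pt, n_pixels=16, half_width=0.8):
    """Pixelate jet constituents into an (n_pixels, n_pixels) image.

    eta, phi : constituent coordinates relative to the jet axis
               (assumed to be centred already).
    pt       : constituent transverse momenta, used as pixel weights.
    Grid size and window are illustrative assumptions.
    """
    edges = np.linspace(-half_width, half_width, n_pixels + 1)
    # Sum the pT of all constituents falling into each (eta, phi) pixel;
    # constituents outside the window are dropped.
    image, _, _ = np.histogram2d(eta, phi, bins=(edges, edges), weights=pt)
    total = image.sum()
    # Normalise to unit total intensity so jets of different pT are comparable.
    return image / total if total > 0 else image


# Three hypothetical constituents: two share a pixel, one sits elsewhere.
eta = np.array([0.05, 0.05, 0.55])
phi = np.array([0.05, 0.05, -0.55])
pt = np.array([10.0, 20.0, 30.0])
image = make_jet_image(eta, phi, pt)
```

An image built this way can be passed to a convolutional network like any single-channel picture, which is what makes the computer vision toolbox applicable to jets.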
Observation of tZq using Machine Learning.
The tZq signal is observed with a significance well over five standard deviations. A machine learning algorithm, gradient boosted decision trees (BDTs), is set up to maximally discriminate between prompt and non-prompt leptons. The BDTs exploit the properties of the jet closest to the lepton in terms of ∆R. The measured tZq production cross section is in agreement with the standard model expectation. The Python implementation of ROOT TMVA (the CERN machine learning and data analysis platform) was used.
Particle Identification using Deep Learning.
Identified heavy, energetic, hadronically decaying particles using machine-learning techniques. Refined a boosted top-tagging technique using deep learning discrimination against the QCD background. Applied computer vision solutions to identify new-physics events from data in multiple channels. Evaluated and compared the performance of a variety of algorithms (“taggers”) designed to distinguish hadronically decaying massive particles.
Link to github
Photon Identification using CNN
This project focuses on building a machine learning pipeline for a high-energy physics particle classifier using Apache Spark, ROOT, Parquet, TensorFlow, and Jupyter with Python notebooks.
The data are Monte Carlo simulations produced using ZllyAthDerivation and processed with the NTUP code to produce flat ROOT ntuples, which are then converted to NumPy arrays using the Array code. CellsToImage is a code that converts the NumPy cells vector into NumPy images for training. An example of training images is shown below:
Real time event selection at LHC with Spark and Deep Learning
This project focuses on building a machine learning pipeline for a high-energy physics particle classifier using Apache Spark, ROOT, Parquet, TensorFlow, and Jupyter with Python notebooks.
The training of DL models has yielded satisfactory results that align with the findings of the original research paper. The performance of the models can be evaluated through various metrics, including loss convergence, ROC curves, and AUC (Area Under the Curve) analysis.
By achieving results consistent with the original research paper, we validate the effectiveness of our DL models and the reliability of our implementation.
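To make the ROC and AUC metrics mentioned above concrete, here is a minimal pure-Python sketch that sweeps a threshold over classifier scores and integrates the resulting curve with the trapezoidal rule. The function names and the toy labels and scores are illustrative assumptions. They are not results from this project, and in practice a library routine would normally be used instead.

```python
def roc_curve(labels, scores):
    """Return (fpr, tpr) lists, sweeping the threshold from high to low.

    labels: 1 for signal, 0 for background (both classes must be present).
    scores: classifier outputs, higher meaning more signal-like.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last event sharing this score,
        # so tied scores produce a single diagonal segment.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy example: one background event outranks one signal event.
fpr, tpr = roc_curve([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
area = auc(fpr, tpr)
```

An AUC of 1.0 corresponds to perfect separation and 0.5 to random guessing, so the value gives a single number for comparing a trained model against a reference.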
Search for tZq Using BDT
A search for the associated production of a top quark and a Z boson (tZq), as predicted by theory, was performed with the full CMS data set collected at 8 TeV. To enhance the separation between signal and background processes, a multivariate discriminator is used in both the tZq-SM and FCNC searches. A range of different quantities are used as input variables for the BDTs. They are selected based on their discriminating power and include kinematic variables related to the top quark and the Z boson. The BDTs are trained separately for each channel, using half of the simulated samples for these processes.
Tableau Data Visualization Pakistan Election Data 2008-2022
The election day results are presented using data gathered from the ECP website. The last three elections for the National Assembly and all four provincial assemblies are covered in the dashboard.
Link to Tableau public
tZq using Boosted Decision Trees.
Two multivariate discriminators, based on observables from the 1bjet and 2bjets regions, are used to enhance the separation between signal and background processes.
The discriminators are based on the BDT algorithm implemented in the toolkit for multivariate analysis TMVA. The BDT is trained using the simulated samples.
The predictions for some of the most discriminating variables in the BDT for the 1bjet and 2bjets regions are compared to data.
Using Variational Autoencoder on particle collision data
Particles, in this case protons, are boosted to high energies inside the Large Hadron Collider (LHC): each beam can reach 6.5 TeV, giving a total of 13 TeV when the beams collide. Electromagnetic fields are used to accelerate the electrically charged protons in a 27-kilometer-long loop. When the proton beams collide, they produce a diverse set of subatomic byproducts that quickly decay and hold valuable information for some of the most fundamental questions in physics.