Sijo Arakkal Peious, a PhD student in the Department of Software Science, will defend his PhD thesis “Measures of Impact and Confounding – an Analysis and Experimental Comparison of Novel and Established Measures” (“Mõju ja segajate mõõdikud – uute ja väljakujunenud meetmete analüüs ja eksperimentaalne võrdlus”) on May 9, 2025, starting at 15:00 (UTC+3, Eastern European Summer Time (EEST)). The defense will take place in room U05-105 (TalTech Study Building 5, Ehitajate tee 5, Tallinn) and can also be followed via Zoom.
The PhD thesis "Measures of Impact and Confounding – an Analysis and Experimental Comparison of Novel and Established Measures" offers a novel method and surprising findings that could reshape how data scientists and researchers approach data analysis. Confounding, a phenomenon where a third variable distorts the relationship between an exposure and an outcome, is a critical challenge in both experimental and observational studies. Despite its widespread impact, confounding has often been overlooked in data mining tools, leaving a gap in high-quality quantitative analysis.
The author, Sijo Arakkal Peious, systematically compares four methods for detecting confounders: the Ad-Hoc method, Oaxaca-Blinder decomposition, the linear-regression-based method, and a novel approach called Coupled Impact Assessment (C-IA). Surprisingly, the results revealed no significant agreement among these methods. This unexpected result highlights the complexity of confounding and underscores the need for more robust, unified approaches.
The research also identifies four distinct patterns of confounding effects, showcased through eight case studies, and introduces a novel interpretation of linear regression models using multiplicative edge diagrams. These advancements aim to improve the accuracy of confounding adjustments in data analysis.
Looking ahead, the study envisions integrating these findings into data mining tools like GrandReport, optimizing them for high-performance analysis, and exploring their applications in machine learning pipelines. This work marks a significant step towards addressing one of science's most persistent challenges, offering new tools and insights to enhance the reliability of quantitative research.
The thesis “Measures of Impact and Confounding – an Analysis and Experimental Comparison of Novel and Established Measures” is published in the Digital Collection of TalTech Library.
Supervisor: Prof. Dirk Draheim
Opponents:
- Dr. Divesh Srivastava, AT&T Labs, New Jersey, USA
- Prof. Arun Kumar Sangaiah, National Yunlin University of Science and Technology, Yunlin, Taiwan
Meeting ID: 966 7567 5068
Passcode: 930272
NB! Before the defense, at 10:00 in room U05-103, Prof. Divesh Srivastava will give a presentation on "Modeling, Exploring and Analyzing Change: The Janus Project". You are welcome to join us!