I co-founded Precision Analytics with Erika, also holding a PhD from McGill and am an accredited statistician. While overseeing our …
Statistical Society of Canada Annual Meeting, 2026 - AI and The Case for an Audit-First Mindset
I attended the Statistical Society of Canada’s annual meeting at McMaster University this year, and discussions around the impact of AI and LLMs on the role of professional and academic statisticians ran through most of the sessions I attended, whether it was a great technical session on discrete choice experiments or a panel on statistical communication.
A workshop on day 1 tackled this subject head-on, and I especially enjoyed the session on AI-assisted literature work led by Ehsan Karim (UBC School of Population and Public Health). The goal was not to learn one platform, but to learn what has to be checked before AI-assisted literature work is usable. That framing is important, because literature research is often one of the less exciting tasks for statisticians, and therefore one of the first we would love to automate. If we could outsource pulling out parameters for a simulation study, or data extraction for meta-analyses, it would save us a great deal of time and effort. This part of the workshop walked through some of these tasks, with discussion of how much independence you hand to the model, and showed that the real skill is not prompting but auditing the results.
Dr. Karim attached a real methods paper and asked a series of LLMs about a specific configuration that the study had never actually evaluated. Every term in the question was a genuine technical term from the paper, which is what made it dangerous: the model had every reason to sound confident. The correct response was to reject the question. Some models did that, some quietly substituted a nearby result, and a few produced a precise-looking number for something that did not exist at all. The takeaway isn’t that all LLMs are unreliable for this use case, but rather that the confidence of an answer is not necessarily indicative of its correctness.
Relatedly, when asked to synthesize across several papers, a model could sound genuinely insightful, but often it was reflecting a framing the user had supplied a few prompts earlier rather than discovering a connection in the papers themselves. The session offered a lot of practical advice, including the “citation audit,” which works at two levels. At the claim level, you do not accept an extracted fact without opening the source and reading the sentence, table, or figure it came from. At the reference level, you compare the title, authors, and journal at the DOI against what the AI reported.
Each check ends in a clear disposition: accept, correct, or reject. And the whole process is documented, including the model and version, the exact prompt, the sources, the human edits, and what was ultimately kept or discarded, so that the work remains defensible later.
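To make this concrete, here is a minimal sketch of how the reference-level check and its audit record could be written in Python. It is not taken from the workshop materials: the names Reference, AuditRecord, and audit_reference, the rule that maps mismatches to a disposition, and every example value are assumptions for illustration. The metadata found at the DOI is assumed to have been looked up by a person and typed in, so nothing here calls an external service.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so trivial differences are ignored."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


@dataclass
class Reference:
    """Hypothetical container for the three fields compared at the DOI."""
    title: str
    authors: List[str]
    journal: str


@dataclass
class AuditRecord:
    """Hypothetical log entry: what was asked, of which model, and the outcome."""
    model: str
    model_version: str
    prompt: str
    doi: str
    disposition: str  # "accept", "correct", or "reject"
    mismatched_fields: List[str] = field(default_factory=list)


def audit_reference(reported: Reference, resolved: Optional[Reference], *,
                    doi: str, model: str, model_version: str,
                    prompt: str) -> AuditRecord:
    """Compare what the AI reported against what a human found at the DOI.

    Assumed rule (illustrative only): no mismatch -> accept; title matches
    but authors or journal differ -> correct; DOI does not resolve or the
    title differs -> reject.
    """
    if resolved is None:
        mismatched = ["doi"]
        disposition = "reject"
    else:
        mismatched = []
        if normalize(reported.title) != normalize(resolved.title):
            mismatched.append("title")
        if ([normalize(a) for a in reported.authors]
                != [normalize(a) for a in resolved.authors]):
            mismatched.append("authors")
        if normalize(reported.journal) != normalize(resolved.journal):
            mismatched.append("journal")
        if not mismatched:
            disposition = "accept"
        elif "title" in mismatched:
            disposition = "reject"
        else:
            disposition = "correct"
    return AuditRecord(model, model_version, prompt, doi, disposition, mismatched)


# Placeholder values only; none of these refer to a real paper or model.
reported = Reference("A Hypothetical Methods Paper",
                     ["A. Author", "B. Author"], "Journal of Examples")
at_doi = Reference("A hypothetical methods paper.",
                   ["A. Author", "B. Author"], "Annals of Examples")
record = audit_reference(reported, at_doi, doi="10.0000/example",
                         model="example-model", model_version="0.0",
                         prompt="List the key references for method X.")
print(record.disposition, record.mismatched_fields)
```

In this sketch the journal name differs, so the reference is flagged for correction rather than accepted or thrown out. The comparison only narrows down where to look; the decision about what to keep, and the human edits that follow, still belong to the person doing the audit.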
At Precision Analytics, we are taking a gradual and cautious approach to adopting LLMs. For example, we might lean on an LLM for low-risk, easily checked tasks like reformatting a reference list or drafting boilerplate for a report template, where any error is obvious on review. The value of our work rests on it being transparent, reproducible, and defensible under scrutiny. AI tools can genuinely accelerate parts of that work, but the responsibility for what gets reported does not transfer to the model. The most useful reframe from the day was that as you delegate more to an AI, your job shifts from doing each step by hand to designing the task, checking the evidence, and recording what you accepted, corrected, or rejected. That discipline has to scale at least as fast as the autonomy does, or the delegation is not worth it. It is a principle we already try to live by, and a timely reminder as these tools become part of everyday research.
Ehsan Karim has kindly made the workshop materials mentioned above available online.
