
How to Spot Hallucinated Citations in Your Thesis Before Your Examiner Does

An audit of 2.5 million papers found roughly 146,900 AI-generated fake citations in 2025 alone. If you used any AI tool while drafting, your bibliography may contain references that do not exist.

The scale of the problem is no longer theoretical

In May 2026, an audit of 2.5 million scientific papers identified approximately 146,900 AI-generated fake citations published in 2025 alone. The fabrication rate climbed from roughly 4 per 10,000 papers in 2023 to 56.9 per 10,000 by early 2026, a roughly fourteen-fold increase in under three years. The inflection point was mid-2024, when AI writing assistants moved from experimental tools to daily workflow for many researchers.

This is not a problem limited to published journal articles. Thesis candidates who used ChatGPT, Gemini, or any other AI tool to help draft sections of their work face the same risk. The AI generates plausible-sounding references that look correct on the page. The DOI format is valid. The author names are real scholars in the field. The journal title exists. But the paper itself was never written. The specific combination of author, title, journal, volume, year, and page range is fabricated.

Resnik and Hosseini (2026), writing in Accountability in Research, argued that hallucinated citations may constitute research misconduct under the U.S. federal definition when citations function as data in scholarly papers. The argument is straightforward: a citation that does not exist is a fabrication, regardless of whether the fabrication was intentional or produced by a tool the candidate trusted.

What a hallucinated citation actually looks like

The danger of hallucinated citations is that they do not look wrong. They look exactly right. That is what makes them difficult to catch without systematic verification.

A typical hallucinated citation follows one of three patterns. The first is the plausible composite: the AI takes a real author's name, a real journal title, and a real-sounding paper title, then combines them into a reference that was never published. The author did write papers. The journal does exist. But that specific paper, in that volume, in that year, does not.

The second pattern is the close neighbour: the AI references a paper that is similar to a real one but alters the title, the year, or the co-author list. A candidate searching for the paper might find the real version and assume the minor discrepancy is a formatting error. It is not. The cited version does not exist.

The third pattern is the complete invention: author, title, journal, and year are all fabricated. These are the easiest to catch but also the least common, because modern language models have learned to produce more convincing composites.
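The close-neighbour pattern is easiest to catch when the cited reference is compared with the record you actually found, field by field, instead of by eye. The following is a minimal sketch of that comparison. The function names, the dictionary layout, the use of family names only, and the 0.95 title threshold are all illustrative assumptions, not part of any registry's API.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lowercase and drop punctuation so pure formatting differences do not count."""
    kept = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(kept.split())


def compare_citation(cited, found, title_threshold=0.95):
    """Return the fields where the cited reference disagrees with the record found.

    Both arguments are dicts with 'title', 'year' and 'authors' (family names).
    The threshold is an assumed value; tune it against your own bibliography.
    """
    mismatches = []
    ratio = SequenceMatcher(
        None, normalise(cited["title"]), normalise(found["title"])
    ).ratio()
    if ratio < title_threshold:
        mismatches.append("title")
    if cited["year"] != found["year"]:
        mismatches.append("year")
    cited_authors = {normalise(name) for name in cited["authors"]}
    found_authors = {normalise(name) for name in found["authors"]}
    if cited_authors != found_authors:
        mismatches.append("authors")
    return mismatches
```

Any non-empty result means the cited version and the real version differ, and the discrepancy should be resolved against the real paper, not dismissed as a formatting slip.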

GPTZero's analysis of NeurIPS 2025 accepted papers found hallucinated references across more than fifty papers at one of the most selective computer science venues in the world. If the problem reaches peer-reviewed papers at NeurIPS, it reaches thesis bibliographies.

Why your standard checks will not catch them

Candidates who check their citations typically do one of two things: they search the title in Google Scholar, or they paste the DOI into a browser. Both methods catch some fabrications but miss others.

Google Scholar may return a real paper with a similar title, leading the candidate to believe the citation is valid when it is a close neighbour. The DOI check fails for a different reason: an AI-generated DOI can be correctly formatted and still not resolve, and many candidates accept the format without clicking through to verify the destination page.

Neither method catches the most insidious failure: a citation that exists but does not support the claim being made. A candidate writes "Smith (2019) found that sample sizes below 40 are sufficient for qualitative saturation." The paper by Smith (2019) exists. But Smith's actual finding was about data saturation in grounded theory, not about sample size thresholds. The candidate cited a real paper for a claim the paper does not make. This is not hallucination in the technical sense, but it produces the same result in the examination room: a citation that does not hold up under scrutiny.

How to check every citation before submission

Systematic verification requires checking each citation against the academic record, not against your memory of what you read. The academic record lives in five registries, each covering a different aspect of a citation's validity.

Crossref resolves DOIs. If a DOI does not resolve, the paper either does not exist or the DOI is wrong. This is the first check and the fastest. Copy the DOI, paste it into doi.org, and confirm that it lands on the paper you cited, not merely on some paper.
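For a long bibliography, the same check can be scripted against the public Crossref REST API, which returns metadata for a DOI at `https://api.crossref.org/works/{doi}` and responds with HTTP 404 when it has no record. The sketch below uses only the Python standard library. The helper names are made up for illustration, and the code assumes the usual Crossref response layout (`title` as a list, `issued` with `date-parts`, `author` entries with a `family` key).

```python
import json
import urllib.error
import urllib.parse
import urllib.request

DOI_PREFIXES = (
    "https://doi.org/", "http://doi.org/",
    "https://dx.doi.org/", "http://dx.doi.org/", "doi:",
)


def normalise_doi(raw):
    """Strip common prefixes so 'https://doi.org/10.x/y' and 'doi:10.x/y' compare equal."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip().lower()  # DOIs are case-insensitive


def crossref_url(doi):
    return "https://api.crossref.org/works/" + urllib.parse.quote(normalise_doi(doi), safe="/")


def fetch_crossref(doi, timeout=10):
    """Return Crossref's 'message' dict for a DOI, or None if Crossref has no record."""
    try:
        with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as resp:
            return json.load(resp)["message"]
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def summarise(message):
    """Pull out the fields worth comparing against the bibliography entry."""
    titles = message.get("title") or [""]
    issued = message.get("issued", {}).get("date-parts", [[None]])
    return {
        "title": titles[0],
        "year": issued[0][0],
        "authors": [a.get("family", "") for a in message.get("author", [])],
    }
```

Calling `fetch_crossref` needs network access. A `None` result is a reason to look further, not proof of fabrication, because some DOIs are registered with agencies other than Crossref (DataCite, for example).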

OpenAlex covers more than 250 million scholarly works, including books, conference papers, and working papers that Crossref sometimes misses. Search by title and first author. If OpenAlex has no record, the paper may not exist.
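The title-and-first-author search can also be scripted. This sketch assumes the OpenAlex works endpoint (`https://api.openalex.org/works`) with its `search` and `per-page` query parameters, and a response whose `results` carry `display_name`, `publication_year`, `doi`, and an ordered `authorships` list. The function names are illustrative, and current access requirements should be checked in the OpenAlex documentation before relying on it.

```python
import urllib.parse


def openalex_search_url(title, per_page=5):
    """Build a search URL for the OpenAlex works endpoint."""
    query = urllib.parse.urlencode({"search": title, "per-page": per_page})
    return "https://api.openalex.org/works?" + query


def candidates_by_first_author(response, family_name):
    """Keep results whose first listed author's name contains the given family name."""
    wanted = family_name.lower()
    hits = []
    for work in response.get("results", []):
        authorships = work.get("authorships") or []
        if not authorships:
            continue
        first = (authorships[0].get("author") or {}).get("display_name") or ""
        if wanted in first.lower():
            hits.append({
                "title": work.get("display_name"),
                "year": work.get("publication_year"),
                "doi": work.get("doi"),
            })
    return hits
```

An empty list from `candidates_by_first_author` means no search result matched the cited first author, which is the signal to investigate further. Substring matching on names is deliberately loose and will need tightening for common family names.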

Semantic Scholar provides abstracts. Once you have confirmed a paper exists, read the abstract and verify that it actually says what you claim it says in your thesis. This is the claim-support check. It is the most time-consuming step and the most important.
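Fetching the abstract can be automated even though judging it cannot. The sketch below assumes the Semantic Scholar Graph API paper lookup (`/graph/v1/paper/DOI:{doi}` with a `fields` parameter) and simply lays the thesis claim next to the abstract for a human to read. The function names are hypothetical, and the abstract field may be empty for some papers.

```python
import urllib.parse


def semantic_scholar_url(doi):
    """Build a Graph API lookup URL that asks for the title and abstract of one paper."""
    paper_id = urllib.parse.quote("DOI:" + doi, safe=":/")
    return "https://api.semanticscholar.org/graph/v1/paper/" + paper_id + "?fields=title,abstract"


def claim_review_sheet(claim, record):
    """Put the claim and the abstract side by side; the judgement stays with the reader."""
    abstract = record.get("abstract")
    if not abstract:
        return f"CLAIM: {claim}\nABSTRACT: (none available; read the paper itself)"
    return f"CLAIM: {claim}\nABSTRACT: {abstract}"
```

The design choice here is deliberate: the script gathers the evidence, and the candidate decides whether the paper says what the thesis claims it says.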

Retraction Watch tells you whether a paper has been retracted since you cited it. A retraction can happen at any time. A paper you read in 2024 may have been retracted in 2025.
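One way to run this check in bulk is to compare the DOIs in your bibliography against a locally downloaded copy of the Retraction Watch data. The sketch below assumes a CSV export with a column holding the DOI of the original (retracted) paper. The column name `OriginalPaperDOI` is an assumption to verify against the file you actually download, and the function names are illustrative.

```python
import csv
import io


def load_retracted_dois(csv_file, doi_column="OriginalPaperDOI"):
    """Read a retraction CSV and return the set of retracted DOIs, lowercased."""
    retracted = set()
    for row in csv.DictReader(csv_file):
        doi = (row.get(doi_column) or "").strip().lower()
        if doi:
            retracted.add(doi)
    return retracted


def find_retracted(bibliography_dois, retracted):
    """Return the bibliography DOIs that appear in the retracted set."""
    return [d for d in bibliography_dois if d.strip().lower() in retracted]


# Tiny in-memory stand-in for the real file, with made-up DOIs.
sample = io.StringIO("Title,OriginalPaperDOI\nSome paper,10.1000/ABC\nOther paper,\n")
retracted = load_retracted_dois(sample)
print(find_retracted(["10.1000/abc", "10.1000/xyz"], retracted))  # ['10.1000/abc']
```

Because retractions can happen at any time, the check is only as current as the copy of the data it reads, so it is worth refreshing the file shortly before submission.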

DOAJ (Directory of Open Access Journals) tells you whether the journal itself is legitimate. If the journal is not indexed in DOAJ and is not a well-known subscription journal, it may be a predatory venue.
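The DOAJ lookup can be sketched in the same style. The endpoint path below (`/api/search/journals/` followed by an `issn:` query) and the `total` field in the response are assumptions based on DOAJ's public search API, and should be confirmed against the current DOAJ API documentation. The function names are hypothetical.

```python
import urllib.parse


def doaj_journal_url(issn):
    """Build a DOAJ journal search URL for one ISSN (endpoint path is an assumption)."""
    return "https://doaj.org/api/search/journals/" + urllib.parse.quote("issn:" + issn, safe="")


def is_indexed_in_doaj(response):
    """Treat any search hit as 'indexed'; assumes the response reports a 'total' count."""
    return response.get("total", 0) > 0
```

A negative result only means the journal is not listed in DOAJ. As noted above, established subscription journals will also come back negative, so this check narrows the question without settling it.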

Running every citation through all five checks by hand takes time. For a thesis with 150 to 300 references, the manual process takes days. Thesisroom's citation guard automates this process by querying all five registries for every entry in the bibliography and returning a traffic-light result for each citation.
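To make the idea of a traffic-light result concrete, here is one way the five checks could be combined into a single status per citation. This is an illustrative sketch only: the field names and the rules are assumptions made for this example and do not describe how Thesisroom's citation guard actually decides.

```python
def traffic_light(check):
    """Combine per-registry results into 'red', 'amber' or 'green'.

    Expects a dict with these (hypothetical) keys:
      exists              - found in Crossref or OpenAlex
      retracted           - listed as retracted
      metadata_mismatches - list of fields that differ from the record found
      journal_indexed     - journal is in DOAJ or otherwise known to be legitimate
      claim_reviewed      - a human has confirmed the abstract supports the claim
    """
    if not check["exists"] or check["retracted"]:
        return "red"  # remove or replace before submission
    if (check["metadata_mismatches"]
            or not check["journal_indexed"]
            or not check["claim_reviewed"]):
        return "amber"  # needs a closer look
    return "green"
```

Putting existence and retraction first reflects the order of the checks above: there is no point reviewing the abstract of a paper that does not exist or has been withdrawn.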

What to do when you find a problem

A flagged citation requires a decision, not a panic. Three outcomes are possible.

If the citation does not exist at all, remove it. Find the real source you intended to cite, or find an alternative that actually supports your claim. Thesisroom's source finder helps locate verified replacement citations from the academic record when a reference is flagged.

If the citation exists but does not support your claim, revise your text. Either change the claim to match what the paper actually says, or find a different paper that supports the original claim. Do not leave a misaligned citation in place.

If the citation exists but has been retracted, remove it and note the retraction in your text if the finding was central to your argument. Citing a retracted paper without acknowledging the retraction is a red flag for examiners and may be classified as negligence.

The cost of not checking

The consequences of submitting a thesis with hallucinated citations range from embarrassment to career damage. At the lighter end, an examiner who spots a fabricated reference will question every other citation in the bibliography. The defence shifts from a conversation about your research to an interrogation of your sources. At the heavier end, universities that classify hallucinated citations as research misconduct can fail a thesis, suspend a degree, or place a notation on the candidate's academic record.

The March 2026 perspective in Accountability in Research made the legal framework explicit: under the U.S. federal definition, a hallucinated citation in a thesis submitted to a federally funded institution could qualify as fabrication. The classification does not depend on intent. The candidate does not need to have known the citation was fake. The standard is whether the citation was verified before submission.

Checking is cheaper than not checking. The time investment is a few days of systematic verification. The cost of not checking is measured in years.

Frequently asked questions

How common are hallucinated citations in student theses?

No large-scale audit of student theses has been published, but the patterns seen in published papers plausibly carry over to thesis bibliographies. The 2026 audit that found roughly 146,900 fake citations in published literature noted that hallucinated references were most common among early-career researchers and small teams. Thesis candidates working alone with AI tools fit that profile.

Can my university fail my thesis for a hallucinated citation?

Policies vary by institution, but the trend is toward treating unverified AI-generated citations as a form of academic misconduct. Resnik and Hosseini (2026) argued that hallucinated citations meet the federal definition of fabrication when citations function as data. Some universities have already updated their academic integrity policies to address AI-generated content, including citations.

Is it enough to check citations in Google Scholar?

Google Scholar catches obvious fabrications but misses close neighbours and claim-support failures. A paper with a slightly different title from a real one will return the real paper in Google Scholar, leading you to believe the citation is correct. Checking against DOI registries like Crossref, combined with reading the abstract via Semantic Scholar, is more reliable.

What if I did not use AI to write my thesis but still have citation errors?

Citation errors predate AI. Manual transcription mistakes, copying references from secondary sources without verifying the original, and outdated citation data are all common. The verification process is the same regardless of how the error was introduced. Check every citation against the academic record before submission.

How does Thesisroom check citations differently from a manual search?

Thesisroom's citation guard queries five academic registries (Crossref, OpenAlex, Semantic Scholar, Retraction Watch, DOAJ) for every entry in your bibliography. Most of the verification is deterministic: a DOI either resolves or it does not, a journal is either indexed or it is not. The AI is used only for the final step: judging whether a cited paper's abstract supports the specific claim in your text. Every flag shows the source that produced it.

Should I verify citations I found through my own library research?

Yes. Even citations found through legitimate library databases can contain transcription errors (wrong volume number, wrong year, misspelled author name). These errors are not misconduct, but they damage credibility. An examiner who finds three incorrect page numbers will read the rest of the bibliography with suspicion.

One check that protects the rest

A bibliography is a trust structure. Each citation tells the reader: I found this source, I read it, and it supports the claim I am making here. When one citation breaks, the trust structure cracks. When three break, it collapses.

Checking every citation before submission is the single highest-return investment a candidate can make in the weeks before a defence. The time cost is measurable. The risk of not checking is not.

Thesisroom's citation guard runs the full verification against the academic record for every entry in your bibliography. The integrity policy explains what the product checks and what it refuses to do.

References

Resnik, D. B., & Hosseini, M. (2026). Hallucinated citations produced by generative artificial intelligence may constitute research misconduct when citations function as data in scholarly papers. Accountability in Research. https://doi.org/10.1080/08989621.2026.2645390
