You are three weeks into your literature review. You have 47 open tabs, a downloads folder full of PDFs named "article(3).pdf" through "article(27).pdf," a half-finished spreadsheet, and a growing sense that you have already read something relevant to your current paragraph but cannot remember where. This is not a personal failing. It is the predictable result of trying to manage information at a scale that exceeds what memory and improvised systems can handle.
The researchers who produce well-structured literature reviews and theses are not necessarily smarter or more disciplined — they use systems that make retrieval reliable even when the source count climbs past 50, 100, or 200 articles. This guide describes a practical system for organising, annotating, and tracking large numbers of sources, whether you are conducting a systematic review, writing a thesis, or preparing a comprehensive research report.
Most researchers are good at finding and collecting sources. The problem comes later, when you need to retrieve a specific piece of information — a statistic, a methodological detail, a theoretical argument — from a source you read weeks ago.
If your system makes retrieval slow or unreliable, you will:
The goal of a source organisation system is not to store articles — it is to make finding any specific piece of information fast and reliable.
If your PDFs are named "1-s2.0-S0140673620301835-main.pdf" (a common format from ScienceDirect), you have already lost the retrieval game. Rename every PDF the moment you download it, using a consistent convention.
A recommended format:
AuthorLastName_Year_ShortTitle.pdf
Examples:
Garcia_2022_CognitiveLoadAnnotation.pdf
WHO_2020_PhysicalActivityGuidelines.pdf
Smith_2019_MetaAnalysisInterventions.pdf
This takes five seconds per file and transforms your downloads folder from a chaos of meaningless filenames into a scannable library.
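Automation tip: Reference managers such as Zotero and Mendeley can rename attached PDFs automatically based on metadata, although the exact option depends on the version you have installed. If you use a reference manager, look for this feature and enable it as early as possible.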
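If you are not using a reference manager, the naming convention above is easy to apply with a few lines of Python. The following is a minimal sketch, not a finished tool: the function name `build_filename` and its parameters are assumptions made for illustration, and you would still supply the author, year, and short title yourself.

```python
import re


def build_filename(author_last_name: str, year: int, short_title: str) -> str:
    """Return a name in the AuthorLastName_Year_ShortTitle.pdf pattern."""
    # Drop punctuation and spaces from the author name (e.g. "O'Neil" -> "ONeil").
    # \W is Unicode-aware in Python 3, so accented letters are kept.
    author = re.sub(r"[\W_]", "", author_last_name)
    # Split the title on anything that is not a letter or digit,
    # then capitalise the first letter of each word and join them.
    words = re.split(r"[\W_]+", short_title)
    title = "".join(w[0].upper() + w[1:] for w in words if w)
    return f"{author}_{year}_{title}.pdf"


print(build_filename("Garcia", 2022, "cognitive load annotation"))
```

For the first example above this prints Garcia_2022_CognitiveLoadAnnotation.pdf. Keeping the short title to three or four words keeps the filenames scannable.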
A source database is any structured record of your sources with metadata and notes. It can be a reference manager (Zotero, Mendeley), a citation platform like DEEPNOTIS, a spreadsheet, or a Notion database. What matters is that it contains, at minimum:
Do not split this information across multiple systems. One database, one source of truth.
Organising sources by type (all journal articles in one folder, all books in another) is almost useless for writing. When you are drafting a paragraph about the effectiveness of intervention X, you need all sources related to intervention X — regardless of whether they are journal articles, reports, or book chapters.
Tag or label every source by the themes, topics, or arguments it is relevant to. A single source can (and usually should) carry multiple tags.
Example tags for a thesis on digital health interventions:
theoretical-framework
RCT-evidence
barriers-to-adoption
user-experience
cost-effectiveness
methodology
background
When you sit down to write the section on barriers to adoption, you filter by that tag and see every relevant source instantly.
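If your source database is a spreadsheet or a plain file, the same filter can be sketched in Python (3.9 or later). Everything here is illustrative: the `Source` class, the `filter_by_tag` function, and the tag assignments given to each source are assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    key: str                                      # short identifier, e.g. "Garcia 2022"
    tags: set[str] = field(default_factory=set)   # one source, many tags


def filter_by_tag(sources: list[Source], tag: str) -> list[Source]:
    """Return every source that carries the given tag, in library order."""
    return [s for s in sources if tag in s.tags]


# Hypothetical library; the tag assignments are invented for this example.
library = [
    Source("Garcia 2022", {"theoretical-framework", "methodology"}),
    Source("Patel 2023", {"barriers-to-adoption", "user-experience"}),
    Source("Smith 2019", {"RCT-evidence", "barriers-to-adoption"}),
]

for source in filter_by_tag(library, "barriers-to-adoption"):
    print(source.key)
```

Because tags are stored as a set on each source, one entry can appear under several themes without being duplicated.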
In DEEPNOTIS: Use citation labels to apply multiple tags to each source. Labels persist across exports and can be used to filter your reference list by chapter, theme, or relevance.
Reading an article and highlighting interesting passages is not annotation — it is decoration. Effective annotation answers specific questions:
Write your annotations in a consistent location — either in your source database, in the PDF margins, or in a linked note. The key is that when you need the information six weeks later, you know exactly where to find it.
As your source list grows, it becomes impossible to remember which articles you have read thoroughly, which you have skimmed, and which you have only saved for later. A simple status system solves this:
Update the status as you work through your reading list. This prevents the common problem of re-reading articles you have already processed and helps you identify gaps in your coverage.
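A status tracker can be as small as a dictionary. The sketch below assumes four hypothetical status labels ("to-read", "skimmed", "read", "to-cite"); replace them with whatever labels your own system uses. The function names are likewise illustrative.

```python
from collections import Counter

# Hypothetical status labels; substitute the ones from your own system.
STATUSES = ("to-read", "skimmed", "read", "to-cite")

# Hypothetical reading list keyed by a short source identifier.
reading_status = {
    "Garcia 2022": "read",
    "Kim 2021": "skimmed",
    "WHO 2020": "to-cite",
    "Patel 2023": "to-read",
}


def update_status(statuses: dict[str, str], source: str, new_status: str) -> None:
    """Set a source's status, rejecting labels outside the agreed list."""
    if new_status not in STATUSES:
        raise ValueError(f"Unknown status: {new_status!r}")
    statuses[source] = new_status


def summarise(statuses: dict[str, str]) -> Counter:
    """Count how many sources currently sit in each status."""
    return Counter(statuses.values())


update_status(reading_status, "Patel 2023", "skimmed")
print(summarise(reading_status))
```

Rejecting unknown labels keeps the list consistent, so a count of sources still waiting to be read is always meaningful.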
A reading log is a chronological record of what you read each day, separate from your source database. It serves two purposes:
A reading log can be as simple as a dated list in a text file:
2025-01-15: Garcia 2022 (cognitive load + annotation), Kim 2021 (stereotype threat)
2025-01-16: WHO 2020 guidelines (physical activity), Patel 2023 (digital health barriers)
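A log in this format is also easy to search programmatically. The sketch below assumes each line follows the "date: entry, entry" pattern shown above, with optional notes in parentheses; the function names `parse_reading_log` and `days_mentioning` are invented for this example.

```python
import re
from datetime import date

LOG = """\
2025-01-15: Garcia 2022 (cognitive load + annotation), Kim 2021 (stereotype threat)
2025-01-16: WHO 2020 guidelines (physical activity), Patel 2023 (digital health barriers)
"""


def parse_reading_log(text: str) -> dict[date, list[str]]:
    """Map each ISO-dated line to the list of entries read that day."""
    log: dict[date, list[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        day_text, _, entries_text = line.partition(":")
        day = date.fromisoformat(day_text.strip())
        # Split on commas, but not on commas inside a parenthesised note.
        entries = re.split(r",\s*(?![^()]*\))", entries_text.strip())
        log.setdefault(day, []).extend(entries)
    return log


def days_mentioning(log: dict[date, list[str]], keyword: str) -> list[date]:
    """Return the days on which any entry mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [day for day, entries in log.items()
            if any(needle in entry.lower() for entry in entries)]


log = parse_reading_log(LOG)
print(days_mentioning(log, "patel"))
```

Searching the log by author or topic answers the question "when did I read that?", which then narrows down where the matching notes live.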
Source organisation is not a one-time setup — it requires periodic maintenance. Every two weeks (or at natural breakpoints in your research), spend 30 minutes reviewing your source database:
The 30-minute rule: A 30-minute review session every two weeks prevents the organisational debt that otherwise accumulates and becomes a full-day crisis at the writing stage.
Before you begin drafting — or at least before you begin the final draft — export your "to cite" sources as a formatted reference list and cross-check it against your outline:
This cross-check takes 20 minutes and often reveals structural gaps that would otherwise surface only during writing, when they are harder to fix.
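One part of this cross-check, finding outline sections that have no source behind them, can be sketched in Python. The sketch assumes that each outline section corresponds to one theme tag and that you can list the tags of your "to cite" sources; the section names, sources, and the function `sections_without_sources` are all hypothetical.

```python
# Hypothetical outline: section heading -> the theme tag that should support it.
outline = {
    "Theoretical framework": "theoretical-framework",
    "Barriers to adoption": "barriers-to-adoption",
    "Cost-effectiveness": "cost-effectiveness",
}

# Hypothetical "to cite" sources and their tags.
to_cite = {
    "Garcia 2022": {"theoretical-framework", "methodology"},
    "Patel 2023": {"barriers-to-adoption"},
}


def sections_without_sources(outline: dict[str, str],
                             to_cite: dict[str, set[str]]) -> list[str]:
    """Return outline sections whose tag is carried by no 'to cite' source."""
    covered_tags = set().union(*to_cite.values())
    return [section for section, tag in outline.items()
            if tag not in covered_tags]


print(sections_without_sources(outline, to_cite))
```

With the sample data above, the only uncovered section is "Cost-effectiveness", which is exactly the kind of gap that is cheaper to find before drafting than during it.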
| Step | Action | Time investment |
|---|---|---|
| 1 | Rename files consistently | 5 seconds per file |
| 2 | Build and maintain a source database | 1-2 minutes per source |
| 3 | Tag by theme | 30 seconds per source |
| 4 | Annotate with purpose | 5-10 minutes per source |
| 5 | Track reading status | 5 seconds per source |
| 6 | Keep a reading log | 2 minutes per day |
| 7 | Review and reorganise | 30 minutes every 2 weeks |
| 8 | Export and cross-check | 20 minutes before writing |
The total time investment is small relative to the time wasted searching for lost sources, re-reading forgotten articles, and reconstructing missing citations during the final push.
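To see what the per-source rows of the table add up to, the figures can be summed directly. This is a rough estimate under stated assumptions: it covers only steps 1 to 5, takes the table's ranges at face value, and leaves out the recurring tasks (reading log, fortnightly review, final cross-check), whose total depends on how long the project runs. The function name is illustrative.

```python
# (low, high) time per source in seconds, taken from steps 1-5 of the table.
PER_SOURCE_SECONDS = {
    "rename file": (5, 5),
    "database entry": (60, 120),
    "tag by theme": (30, 30),
    "annotate": (300, 600),
    "track status": (5, 5),
}


def per_source_total_hours(n_sources: int) -> tuple[float, float]:
    """Return the (low, high) total hours for steps 1-5 across n_sources."""
    low = sum(lo for lo, _ in PER_SOURCE_SECONDS.values())
    high = sum(hi for _, hi in PER_SOURCE_SECONDS.values())
    return n_sources * low / 3600, n_sources * high / 3600


low, high = per_source_total_hours(100)
print(f"{low:.1f} to {high:.1f} hours")
```

For 100 sources this comes to roughly 11 to 21 hours spread over the whole project, almost all of it annotation, which is time spent reading closely rather than administrative overhead.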
DEEPNOTIS supports the workflow described in this guide: import sources by DOI or file, enrich metadata automatically, apply citation labels for theme-based organisation, and export a formatted reference list in your required style. The label system is particularly useful for researchers managing 50+ sources — each source can carry multiple labels, making it possible to filter by chapter, theme, or relevance without duplicating entries.
Whether you are building a systematic review, a thesis, or a research report, the difference between a well-organised source collection and a messy one is not talent — it is system. Start early, maintain consistently, and let the system do the retrieval work for you.