Introduction
The landscape of scientific publishing is undergoing a significant transformation. The last decade has seen an explosive growth of digital research outputs, from terabyte-scale datasets to complex codebases, multimodal workflows, and interactive computational notebooks. In parallel, cultural shifts toward open science, transparency, and equitable access have challenged traditional publishing models.1,2 Neuroscience sits at a unique crossroads of these revolutions. On one end are massive population datasets such as the Adolescent Brain Cognitive Development (ABCD) Study® (https://abcdstudy.org) and the Human Connectome Project, which promise unprecedented statistical power and generalizability.3,4 These large-scale population datasets have enabled the identification of robust organizational principles, normative developmental trajectories, and population-level heterogeneity in brain-behavior relationships. On the other end is the rise of small-N, dense-sampling approaches that emphasize detailed, within-individual characterization of brain organization and dynamics.5 This approach has revealed fine-grained, individual-specific features of brain organization that are often obscured by group averaging.5–7 Together, these approaches have driven advances ranging from network-level models of brain organization to mechanistic insights into individual variability and change over time.
While these issues resonate across areas of science, this commentary is primarily grounded in systems, computational, and neuroimaging-oriented neuroscience, where questions of scale, reproducibility, and analytic transparency have become especially salient. In these domains, methodological choices are closely intertwined with publishing practices, evaluation criteria, and incentives that shape what kinds of contributions are visible and valued. Accordingly, our focus is on publishing and evaluation norms, rather than on adjudicating the empirical validity of specific study designs.
Yet despite this methodological diversity, one bottleneck persists: the publishing ecosystem. Current editorial incentives, peer-review expectations, and impact metrics systematically privilege scale over depth, novelty over transparency, and narrative results over reproducible workflows. As a result, critical scientific contributions, including code, data, negative results, and detailed methodological descriptions, often remain undervalued or unpublished.
Scientific progress, however, depends on aligning methodological choices with the questions we aim to answer. Different questions require different forms of evidence: some demand statistical breadth to characterize population variability, while others require mechanistic depth to resolve individual-level structure or patterns that emerge only through dense, repeated sampling. No single study design can meet all inferential needs, and treating one approach as inherently superior obscures the value of pluralistic scientific inquiry. Clarifying the research question first, specifically what phenomenon we seek to characterize, predict, or explain, determines which methodological scale is most appropriate.
Small-N studies allow researchers to densely sample individuals, enabling precise mapping of brain function and detailed hypothesis refinement. At the other end, large-scale population studies are often better suited to examining individual differences, estimating stable population-level associations, and testing generalizability. Across both approaches, a focus on reproducibility remains imperative and can take different forms. Throughout this commentary, we use reproducibility to encompass the ability to reproduce results from shared code and workflows (workflow reliability), the stability of estimates within individuals or datasets (measurement reliability), and the extent to which findings replicate across samples (population generalizability). Similarly, terms such as mechanistic or causal inference depend on study design, dependent variables, and strategies used to isolate causal relationships rather than on sample size alone.
Complementary, not competitive: Approaches along a methodological continuum
What big data knows
Large-scale datasets prioritize between-subject effects, enabling scientists to detect broad patterns in brain and behavior, characterize population-level variability, and build predictive models with improved generalizability. These studies excel at identifying what varies and how much it varies across individuals and contexts.8 They anchor neuroscientific inference in robust statistical foundations and provide reference benchmarks for the field.
What small-N knows
Dense sampling approaches prioritize within-subject effects, often with hours of data across repeated sessions within a single participant.5,7 This approach is necessary to precisely map individual-specific brain connectivity features.7 These studies excel at identifying how and why something varies, providing mechanistic insight, and driving methodological innovation, which can later be evaluated and generalized using large-scale datasets. Their strengths lie in granularity and experimental flexibility, qualities that large-scale designs cannot easily match.
Why scale became dominant
The recent emphasis on large-scale datasets in neuroimaging did not arise arbitrarily but was motivated by important methodological concerns. Brain-wide association studies (BWAS) and related consortium efforts highlighted the instability and inflation of effect size estimates in small samples, as well as the difficulty of detecting reliable brain-behavior relationships across thousands of features.9 Large datasets provide improved statistical power, more stable effect size estimates, and stronger norms for replication and multiple-comparison control. These developments have substantially advanced the field by establishing clearer expectations for statistical robustness in population-level analyses. At the same time, subsequent discussions have emphasized that scale alone cannot resolve all inferential challenges.10 Measurement heterogeneity, confounds, and the difficulty of interpreting brain-behavior associations that span many regions and rarely map cleanly onto specific biological or cognitive mechanisms remain ongoing limitations.11 Rather than rendering smaller or mechanistic studies obsolete, the rise of large-scale studies suggests that large-scale and investigator-led approaches occupy different positions along a methodological continuum, with the most appropriate design determined by the research question being asked.
Why the field needs both
These approaches are epistemologically complementary, not antagonistic, and sit at opposite ends of a continuum. Population studies can uncover heterogeneity that small-N studies can explain. Small-N studies can illuminate mechanisms that population studies can then test at scale. Methodological safeguards such as preregistration and registered reports provide important protections against analytic flexibility and publication bias, particularly in small-N research. These practices strengthen inferential transparency and internal validity, but they do not fully address challenges related to effect-size stability or population generalizability. As such, they should be viewed as complementary to large-scale studies designed to evaluate robustness across individuals. Importantly, there are scientific contexts in which N=1 designs are not a limitation but a necessity. In clinical and biomedical neuroscience (e.g., lesion studies, rare genetic disorders, neurosurgical cases, and individualized treatment response), single-participant investigations can be uniquely informative and clinically actionable. One influential example is the case of patient H.M., who developed profound anterograde amnesia following bilateral medial temporal lobe resection.12 This single-case observation provided foundational evidence linking the hippocampus to memory formation and shaped subsequent theories of memory systems. In these settings, individual variability is the signal rather than the noise, and generalization occurs through mechanistic insights, replication across cases, or translational relevance rather than population-level statistics. Recognizing these exceptions further underscores the need for publishing frameworks that accommodate diverse inference regimes. Taken together, healthy scientific ecosystems require both large-scale and small-N studies, and publishing models must reflect that duality by prioritizing both (Table 1).
Publishing incentives: How the current system picks ‘winners’
The privilege of N
Traditional publishing norms heavily reward sample size, novelty, and “impact,” often defined through citation metrics or journal prestige. This creates a structural advantage for large population studies: their scale alone signals rigor, even when the inferential goals of the research might not require such breadth. As a result, small-N designs, despite offering mechanistic insights and methodological innovations, are frequently perceived as niche or insufficiently generalizable. This dynamic inadvertently marginalizes small-N research, which may be equally rigorous but less aligned with conventional expectations.13 Importantly, the result is an incentive system that conflates the number of individual human subjects with research quality, sidelining approaches that answer different, equally important scientific questions.14 These concerns align with broader reform efforts such as the San Francisco Declaration on Research Assessment (DORA), which has long advocated for shifting evaluation away from journal-based metrics and toward meaningful assessment of research quality and contribution.15
Invisible outputs
Modern neuroscience depends on outputs that fall outside the traditional manuscript format: code, preprocessing pipelines, quality-control procedures, standardized workflows, null results, and computational tutorials. Yet these contributions remain largely undervalued within the reward systems of the publishing ecosystem and are rarely recognized with the same weight as narrative research papers, despite being essential for reproducibility and scientific integrity.16 This undervaluation discourages researchers, especially trainees, from rigorously documenting code and procedures, as such efforts rarely count toward career advancement and the resulting outputs are often rejected by journals because they do not present new empirical findings.
Consequences for the field
These distorted incentives have measurable costs. A fixation on novelty and scale contributes to reviewer fatigue, prolongs publication timelines, and creates pressure to overlook methodological justification. Moreover, by undervaluing code and methodological transparency, journals inadvertently perpetuate the reproducibility crisis: findings become difficult to validate, pipelines remain opaque, and research communities lose the ability to build effectively on prior work.17,18 The field is left with a paradox: we generate more data and methods than ever before, yet our publishing structures often prevent those resources from meaningfully contributing to cumulative scientific progress, the kind of progress that builds public trust and enables safe clinical translation.
Principles of reform: From gatekeeping to cultivating scientific progress
Rewriting review criteria
Peer-review guidelines should explicitly recognize transparency, reproducibility, and methodological clarity as core markers of scientific rigor. Evaluation must prioritize whether the evidence presented is appropriate for the research question, whether the analytic workflow is reproducible, and whether limitations are clearly articulated. Shifting the focus from sample size and predicted “impact” toward quality and openness would allow both large-scale and small-N studies to be judged on their scientific merits rather than on ingrained assumptions about scale. This recalibration would also reduce bias against exploratory, mechanistic, or tool-building work that currently struggles to fit traditional review expectations.
Expanding what counts
Modern neuroscience generates diverse scholarly products (e.g., code libraries, datasets, containerized workflows, preprocessing pipelines, and methodological tutorials) that meaningfully advance the field but seldom qualify as publishable “papers.” Journals can address this gap by supporting dedicated article formats that formally recognize these outputs as first-class research contributions. The success of the Brain Imaging Data Structure (BIDS) illustrates how standardization and open documentation can transform scientific practice by promoting reuse, interoperability, and cumulative discovery.19 Several journals have already begun to implement aspects of these reforms, including requirements for code and data availability, assignment of DOIs to software and datasets, and structured review of computational materials. For example, some journals now require authors to deposit code in public repositories with persistent identifiers, while others offer dedicated article formats for software, data descriptors, or reproducible workflows. These initiatives demonstrate that reform is not speculative but already underway in certain segments of the publishing ecosystem. Furthermore, by highlighting such contributions, publishers can reinforce behaviors that directly enhance scientific progress.
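To make the BIDS example concrete, the short Python sketch below lays out a skeleton BIDS-style dataset. It follows core BIDS conventions (a required top-level dataset_description.json and sub-<label>/<datatype>/ directories), but the dataset name, file paths, and metadata values are purely illustrative rather than drawn from any particular study.

import json
from pathlib import Path

# Illustrative skeleton of a BIDS-style dataset (paths are placeholders).
root = Path("example_bids_dataset")
(root / "sub-01" / "anat").mkdir(parents=True, exist_ok=True)
(root / "sub-01" / "func").mkdir(parents=True, exist_ok=True)

# BIDS requires a dataset_description.json at the top level of the dataset.
description = {
    "Name": "Example dense-sampling dataset",
    "BIDSVersion": "1.8.0",
    "DatasetType": "raw",
}
(root / "dataset_description.json").write_text(json.dumps(description, indent=2))

# Imaging files then follow the sub-<label>/<datatype>/ naming scheme, e.g.:
#   sub-01/anat/sub-01_T1w.nii.gz
#   sub-01/func/sub-01_task-rest_run-01_bold.nii.gz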
Attribution that reflects reality
Scientific research is increasingly collaborative and computational, yet authorship conventions still center on narrative writing and data analysis. Development of software infrastructure, maintenance of pipelines, curation of datasets, and creation of documentation are often critical to a research project’s success but remain undervalued. Structured contribution taxonomies such as CRediT provide a mechanism for transparent and equitable attribution, ensuring that each contributor receives recognition aligned with their actual role.20 Assigning persistent identifiers (DOIs) to code, data, and workflows further allows these outputs to be cited, tracked, and incorporated into academic CVs and evaluation processes.
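As one illustration of how such attribution can be made machine-readable, the hedged sketch below writes a minimal CITATION.cff file for a hypothetical analysis pipeline; the title, version, author name, and ORCID are placeholders, and archiving services such as Zenodo can then mint a citable DOI for a tagged release of the repository.

from pathlib import Path
from textwrap import dedent

# Minimal CITATION.cff for a hypothetical pipeline; all values are placeholders.
citation = dedent("""\
    cff-version: 1.2.0
    message: "If you use this pipeline, please cite it as below."
    title: "Example preprocessing pipeline"
    version: "0.1.0"
    authors:
      - family-names: "Doe"
        given-names: "Jane"
        orcid: "https://orcid.org/0000-0000-0000-0000"
""")
Path("CITATION.cff").write_text(citation)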
Remodeling scientific communication
The static PDF is poorly suited to a field driven by computational workflows. Neuroimaging analyses depend on complex pipelines, machine learning models, and/or interactive visualizations that cannot be fully captured in traditional formats. Publishing executable, interactive artifacts such as Jupyter notebooks, containerized analyses, or cloud-hosted demos would provide readers with direct access to the analytic logic behind the findings. This shift would not only improve reproducibility but also democratize methods by making it easier for trainees and researchers without extensive computational resources to learn and apply new techniques to their small-N datasets. Notably, several platforms already operationalize these principles. Initiatives such as Evidence (formerly NeuroLibre) integrate executable analyses and code, while platforms like PubPub support dynamic, community-driven scientific communication.21,22 Similarly, publishing models adopted by eLife have shifted away from traditional binary accept-reject decisions by publishing reviewed preprints with public assessments, emphasizing transparent evaluation rather than journal prestige.23
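As a minimal, hedged sketch of what an executable artifact can look like in practice, the example below uses papermill to re-execute a hypothetical analysis notebook with explicit parameters; the notebook filename and parameter names are assumptions for illustration rather than references to any specific published workflow.

# Re-run a shared analysis notebook with explicit parameters, producing an
# executed copy with all outputs embedded (requires papermill to be installed).
import papermill as pm

pm.execute_notebook(
    "analysis.ipynb",          # hypothetical notebook distributed with a paper
    "analysis_rerun.ipynb",    # executed copy, with cell outputs embedded
    parameters={"subject_id": "sub-01", "n_sessions": 10},
)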
Benefits of methodological pluralism
Faster innovation through diverse methods
Small-N approaches serve as methodological incubators, enabling rapid iteration, experimental flexibility, and detailed examination of mechanisms. Researchers can refine analytical tools, validate new measurement strategies, or pilot emerging technologies well before these methods can be scaled to larger studies. Large datasets then provide the complementary platform needed to evaluate these innovations across populations, test their generalizability, and identify boundary conditions. This bidirectional flow from exploratory depth to population-level validation creates a more efficient discovery-validation cycle than either method can achieve alone. Recent advances in computational methods further extend this ecosystem, as machine learning, simulation, and data augmentation increasingly allow researchers to combine empirical datasets with synthetic or simulated data to explore model behavior and improve generalization.24 These hybrid workflows can help bridge gaps between breadth and depth, but they also introduce new challenges for scientific evaluation, including questions of data provenance, validation against real-world observations, and potential bias amplification.
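As a toy illustration of such a hybrid workflow, the sketch below augments a small simulated "empirical" dataset with noise-perturbed synthetic copies before fitting a predictive model. The data are randomly generated and the augmentation scheme is deliberately simple; it is meant only to show the general pattern, including why provenance (which rows are real versus synthetic) needs to be tracked.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-in for an empirical dataset: 50 observations, 10 features.
X_real = rng.normal(size=(50, 10))
y_real = X_real @ rng.normal(size=10) + rng.normal(scale=0.5, size=50)

# Synthetic variants created by jittering the empirical features with noise.
X_synth = X_real + rng.normal(scale=0.1, size=X_real.shape)

# Track provenance so real and synthetic rows remain distinguishable.
X_aug = np.vstack([X_real, X_synth])
y_aug = np.concatenate([y_real, y_real])
is_synthetic = np.concatenate([np.zeros(len(X_real)), np.ones(len(X_synth))]).astype(bool)

model = Ridge(alpha=1.0).fit(X_aug, y_aug)
print(f"R^2 on the real observations only: {model.score(X_real, y_real):.2f}")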
Democratizing discovery
A publishing ecosystem that values diverse contributions expands who can meaningfully participate in neuroscientific research. When methodological rigor, transparent reporting, and reusable workflows are recognized alongside scale, smaller laboratories and early-career researchers gain opportunities to produce influential work without requiring extensive computational or financial resources. This shift reduces structural barriers that currently concentrate “high-impact” research in a subset of well-funded institutions. By recognizing contributions such as software, protocols, and method papers, the field can reward excellence in many forms, not only those enabled by rich infrastructure.
Toward cumulative, reusable science
Methodological pluralism flourishes when scientific outputs are designed to be reused, adapted, and extended. At the same time, the distinction between large-scale and small-N approaches is increasingly porous. Emerging practices, including large, deeply phenotyped single-participant datasets, federated and synthetic data approaches, and brain-wide association frameworks, challenge simple dichotomies of scale. Rather than erasing the need for methodological pluralism, these hybrid regimes reinforce it by combining depth, breadth, and new forms of inference. Publishing systems must therefore remain flexible enough to evolve alongside these shifting methodological landscapes. Open workflows, containerized pipelines, shared codebases, and transparent reporting practices allow researchers to build directly on one another’s work rather than reproducing entire analytic infrastructures from scratch. Resources such as NowIKnowMyABCD exemplify how community-supported infrastructure with shared analytic resources can make large datasets more accessible by providing clear, standardized workflows for analysis.25 Such reuse lowers barriers to replication, reduces redundancy across labs, and increases the longevity of scientific contributions.11,26 Importantly, it also facilitates integration across methodological scales: innovations developed in dense-sampling paradigms can be rapidly tested in population datasets, and population-level discoveries can inspire targeted mechanistic studies. Pluralism thus strengthens neuroscience not only by diversifying methods but by making the field more cumulative, connected, and efficient.
Prioritizing mechanistic insight
A pluralistic methodological landscape also strengthens neuroscience by elevating mechanistic understanding as a central scientific objective. Small-N, investigator-led designs are uniquely positioned to test hypotheses, identify causal processes, and characterize individual variation, insights that large population studies cannot access through scale alone. Conversely, population datasets clarify which mechanisms are widespread, which are context-specific, and which may represent idiosyncratic adaptations. Integrating these approaches allows the field to move beyond descriptive mapping toward explanations that link neural organization to behavior, development, and intervention. Prioritizing mechanistic insight ensures that discovery is not defined solely by detecting effects, but by understanding the processes that generate them.
Challenges and realities: Avoiding the open science mirage
Infrastructure isn’t free
Open science is often framed as a matter of willingness rather than resourcing, yet the practical demands of maintaining FAIR-compliant repositories, long-term storage, versioning systems, and containerized workflows are substantial.27 Without sustained institutional and financial support, “open” resources may be ephemeral.28 The result is an “open” ecosystem that looks robust on paper but is fragile in practice, placing the burden of maintenance on individual researchers rather than on the systems that benefit from open data.
Documentation debt
Releasing code without documentation, metadata, or usage examples creates an illusion of transparency without real reproducibility.29,30 Poorly annotated scripts or undocumented pipelines cannot be meaningfully reused or audited, which undermines reproducibility and may even mislead users. Documentation debt accumulates quickly, especially in large collaborative projects, and addressing it requires explicit recognition, time, and reward structures. Treating documentation as an optional afterthought reinforces the gap between nominal openness and actual usability. Importantly, these costs are not distributed evenly across the scientific workforce. Early-career researchers and trainees often take on a disproportionate share of the labor associated with documentation, code maintenance, and data sharing, despite having lower institutional power and job security. When such efforts are undervalued in hiring, promotion, and funding decisions, open science practices can inadvertently create career risks rather than benefits. Addressing this imbalance requires not only technical infrastructure but explicit recognition and reward structures that align with career sustainability.
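As a small illustration of the difference documentation makes, the sketch below shows a shared quality-control function with a docstring and usage example. The function is a simplified, hypothetical stand-in (not the implementation from any particular pipeline), but it illustrates the habit that turns released code into reusable code: stating inputs, outputs, units, and assumptions where the code lives.

import numpy as np

def framewise_displacement(motion_params: np.ndarray) -> np.ndarray:
    """Summarize volume-to-volume head motion from rigid-body parameters.

    Parameters
    ----------
    motion_params : np.ndarray, shape (n_volumes, 6)
        Three translations (mm) followed by three rotations (radians).
        Rotations are converted to mm assuming a 50 mm head radius.

    Returns
    -------
    np.ndarray, shape (n_volumes,)
        Sum of absolute backward differences across the six parameters;
        the first volume is assigned 0.
    """
    params = motion_params.copy()
    params[:, 3:] *= 50.0                    # radians -> mm on a 50 mm sphere
    diffs = np.abs(np.diff(params, axis=0))  # backward differences between volumes
    return np.concatenate([[0.0], diffs.sum(axis=1)])

# Usage example with simulated motion estimates for 10 volumes.
fd = framewise_displacement(np.random.default_rng(0).normal(scale=0.1, size=(10, 6)))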
Reviewer burden
As research outputs become more complex, peer review must evolve as well. Traditional reviewers are often asked to assess codebases, large datasets, or containerized workflows without having the time or, in some cases, the specialized expertise needed to evaluate them rigorously. This dynamic creates inconsistent standards and risks overburdening reviewers who already face significant time demands. Journals may need dedicated technical editors, software reviewers, structured code-review processes, and/or incentives to ensure quality while distributing the evaluative load more equitably across the community.
Standardization vs. innovation
Community standards are essential for interoperability, discoverability, and data reuse, but standardization can inadvertently impede innovation if applied inflexibly.31 Overly prescriptive requirements may limit experimental creativity or hinder the development of novel analytic approaches that fall outside established schemas.32 Publishing models must therefore balance the advantages of standardization with the need to accommodate emerging methods, allowing researchers to propose well-justified deviations from norms when scientifically appropriate. A healthy ecosystem supports convergence where helpful, and divergence where necessary.
Recommendations for building a system that rewards what we value
Realigning publishing practices with the values of transparency, methodological pluralism, and cumulative science requires coordinated action across journals, funders, institutions, researchers, and the broader community. No single stakeholder can reform the system alone: journals shape evaluation criteria, funders determine what work is financially sustainable, researchers produce the outputs that must be recognized, and communities create the standards and infrastructures that make openness feasible. Table 2 summarizes concrete, actionable steps for each group. These steps are mutually reinforcing and collectively capable of building a publishing ecosystem that rewards what the field claims to value.
While systemic reform will require coordination across stakeholders, several practical editorial changes could help align publishing incentives with methodological pluralism without substantially increasing reviewer burden. First, journals should explicitly evaluate studies relative to their inferential goals rather than relying on sample size as a proxy for rigor. Second, publishers should expand article formats that recognize methodological and infrastructural contributions (e.g., codebases, workflows, and standardized pipelines), allowing these outputs to be evaluated independently of narrative results. Third, structured contribution taxonomies (e.g., CRediT) should be used to accurately credit roles such as software development, data stewardship, and methodological development. Finally, where appropriate, journals should separate the technical review of computational workflows from the evaluation of scientific novelty, distributing the evaluation burden across reviewers with relevant expertise. These steps represent incremental adjustments that could better align editorial incentives with the diverse methodological contributions that drive progress in modern neuroscience.
Conclusions
Neuroscience advances most effectively when it embraces methodological pluralism. Large-scale population datasets provide opportunities for convergence, shared observational baselines, and statistical power that can anchor field-wide inference. They connect researchers to common reference points and enhance replicability, yet these strengths come with tradeoffs: creativity and innovation may be constrained by the breadth-first logic of large studies, effect sizes can be difficult to interpret, and the “streetlight effect” risks prioritizing what is easy to measure over what is scientifically most illuminating.
Investigator-led, small-N research offers a complementary and equally essential set of strengths that sits at the opposite end of the methodological continuum. These designs provide deep resolution on individual differences and mechanistic processes. Although such insights cannot always be generalized statistically, they are indispensable for explanation, theory building, and intervention design. Their flexibility supports methodological innovation and allows researchers to “bridge the data,” connecting decades of expertise to specific phenomena. Yet they, too, carry limitations: restricted generalizability, susceptibility to sampling biases, and a need for careful attention to statistical detection and effect-size interpretability.
Although mechanistic and intervention-focused studies often align naturally with small-N designs, discussions across the field highlight that these approaches should be viewed as points along a continuum rather than as categorical opposites. Each provides different affordances: some tied to depth, some to breadth, and many that shift depending on the research question and the level of inference sought. The absence of a shared language for describing this continuum complicates how we evaluate and communicate scientific findings. Developing a common conceptual framework will help clarify how diverse methodological strategies contribute to discovery and guide more balanced decisions about study design and publication.
Methodological approaches vary in the types of questions they can address, meaning that no single design can fulfill all inferential goals. Population designs foster stability across the field, while investigator-led designs foster creativity and mechanistic depth. The publishing system must therefore evolve to recognize the epistemic interdependence of these approaches. A model that rewards scale alone risks suppressing innovation, while one that endorses only depth risks disrupting the field. Scientific progress relies on the dynamic interplay between breadth and depth, convergence and creativity, replication and explanation.
To support this pluralistic ecosystem, reforms in publishing practices are essential. Expanding the definition of rigor to include transparent methods, well-documented workflows, reusable code, and null results will help align incentive structures with the realities of modern computational neuroscience. Recognizing the contributions of data stewards, software engineers, and method developers is equally essential, as these roles underpin the reproducibility and sustainability of the field’s scientific output. By valuing the full spectrum of research products and not just narrative findings, journals can foster a culture in which investigators are encouraged to build, document, share, and refine the tools that make discovery possible.
The brain-mapping community is exceptionally well-positioned to lead this transformation. It has already pioneered open-data frameworks, community standards, and computational tools that have shaped the trajectory of the field. The next step is to extend this leadership into publishing itself: to model a system that rewards what we value, supports methodological pluralism, and enables all researchers, regardless of scale, resources, or institutional setting, to contribute meaningfully to scientific progress.
A pluralistic, inclusive publishing ecosystem addresses existing limitations and is a pathway to accelerating discovery, strengthening reproducibility, and democratizing scientific contribution. Aligning incentives with the full breadth of methodological contributions to scientific inquiry will help ensure that insight, rather than scale alone, shapes future advances. Large-N and small-N approaches each provide unique strengths, and their utility is determined by the research question at hand. When the publishing system supports methodological pluralism, the field is better positioned to generate cumulative and mechanistically grounded knowledge.
Funding sources
The authors have nothing to disclose.
Acknowledgements
Flux 2025 Pre-Conference Workshop
This commentary was inspired by the Flux Annual Meeting Pre-Conference Workshop titled “Designing Investigator-Led Studies in the Era of Big Data,” organized by Drs. Andrew Lynn and Xiaoqian Chai, which was held in Dublin, Ireland, on September 4th, 2025. We thank the organizers and the participants of this workshop for their engagement and permission to share their insights.
Conflicts of interest
The authors declare no competing interests.
