Incikabi, Semahat
2026-04-25; 2026-04-25; 2025
0036-6803; 1949-8594
https://doi.org/10.1111/ssm.18402
https://hdl.handle.net/11486/8449

The objective of this study is to evaluate PISA mathematics literacy items in terms of their alignment with authentic contexts. A qualitative research design was adopted, employing document analysis to address the research aim. A total of 133 released items from 67 PISA contexts were analyzed. The findings indicate that the majority of the items exhibited a poor level of authenticity, while only a small proportion demonstrated a high degree of alignment with authentic contexts. Further analysis revealed that 88 items, which initially showed potential for being classified as good-fit or stereotypical based on their overall authenticity scores, were ultimately rated as poor-fit due to insufficient representation across certain authenticity aspects. The findings also suggest that three aspects (affective purpose, specificity of information, and question) play a pivotal role in the classification of items as poor-fit. Notably, nearly half of the items categorized as poor-fit remained at that level due to the absence of a single critical component. These results suggest that relatively minor, targeted revisions could substantially improve the authenticity of assessment tasks. The findings provide actionable insights for assessment developers, curriculum designers, and educators, offering clear priorities for enhancing the design of large-scale assessment items and informing policies that promote more meaningful and contextually relevant mathematics learning experiences.

en
info:eu-repo/semantics/closedAccess
authenticity; large scale assessments; mathematics education; PISA
Investigating Authentic Nature of PISA Mathematics Items
Article
10.1111/ssm.18402
2-s2.0-105018498264
Q1
WOS:001586823600001
Q3
0000-0002-7686-1996