Proposal for a unified selection to medical residency programs

Sônia Ferreira Lopes Toffoli a, Olavo Franco Ferreira Filho b, Dalton Francisco de Andrade c

a Graduate Program in Production Engineering, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil. Universidade Estadual de Londrina, Londrina, PR, Brazil.
b Department of Internal Medicine, Universidade Estadual de Londrina, Londrina, PR, Brazil.
c Department of Informatics and Statistics, Universidade Federal de Santa Catarina, Florianópolis, SC, Brazil.


Objective: This article proposes unifying the entrance examinations for Medical Residency (MR) programs in Brazil. It highlights problems related to MR and their interface with public health problems in Brazil, and how this proposal can help to address them.

Methods: The proposal consists of creating an item bank to be used in the unified selection for MR. Some advantages of using Item Response Theory (IRT) in this item bank are highlighted.

Results: Selection examinations for MR programs are developed and administered in a decentralized way, with each institution responsible for its own assessment. The quality of these tests is questionable, and studies on the quality of the items and on the validity and reliability of the instruments are not commonly published.

Conclusion: Assessment is important in every education system, prompting necessary changes and monitoring teaching and learning. The proposed unification of the selection test for MR, besides offering a high-quality assessment to participating institutions, could serve as an additional resource to evaluate undergraduate medical courses and consequently prompt improvements in them, provide data for studies, and enable regional mobility.



Medical Residency (MR) is an in-service training program considered the best way to train physicians for professional practice. Currently, in some specialties, there is a mismatch between the number of applicants and the number of MR vacancies, which results in fierce competition in the selection processes, especially in the more developed regions of Brazil.1,2

Although institutionalized in Brazil for over 30 years, MR lacks specific policies and mechanisms to coordinate the need for and the supply of professional training.3,4 In response to this demand, the Programa Nacional de Apoio à Formação de Médicos Especialistas em Áreas Estratégicas (Pró-Residência) was launched in 2009 to favor the training of specialist physicians according to the needs of the Brazilian Unified Health System5 (Sistema Único de Saúde - SUS).

Studies of the distribution of MR vacancies across the regions of Brazil show that the concentration of programs and institutions offering MR is far greater in the Southeast and South than in the North and Northeast, but also that the percentage of scholarships offered in these disadvantaged regions has increased since the Pró-Residência program.1

Policies to encourage the retention of physicians in underserved places are used worldwide. These incentives can be financial or non-financial; although important, a salary increase loses its strength when not accompanied by other benefits.6,7 Examples of non-financial incentives are good housing conditions and schooling for children, continuing professional education options, the possibility of career advancement, assistance programs for residents as a preventive measure against stress and anxiety, and differentiated conditions when competing for MR vacancies.7-10

Other proposals suggested by experts as complementary actions include implementing new MR programs in less privileged regions, refinancing educational debts and granting scholarships to students and residents in exchange for medical practice in underserved areas, creating a national career, and changing the MR entrance examinations.6,7,10,11 The policies adopted to solve or alleviate the shortage of health professionals in underserved areas are varied, but actions related to medical education, particularly MR, are always among them.

Currently, experts are concerned with improving the health care system to provide more efficient and humanistic services. There are several studies, debates and meetings on professional training and its integration into the health system. The 2010 National Meeting of Medical Entities (Encontro Nacional de Entidades Médicas)12 resulted in several proposals concerning the training of medical residents, among them: guaranteeing an MR vacancy for each student who completes the undergraduate course, expanding MR vacancies according to demand from the public health sector, unifying the selection criteria of MR tests, minimizing subjectivity in the assessment of incoming residents, and not prioritizing local graduates over candidates from elsewhere.

Meeting most of these proposals and complementing current policies, this study proposes a unified selection examination throughout the country in order to change the model of access to MR programs, with advantages consistent with current actions aimed at training physicians and guiding these professionals to priority areas in SUS.

The uneven geographical distribution of physicians is considered serious because it generates shortages of professionals in remote areas. In addition to the shortage of physicians to work in primary health care, there are few medical specialists.1,5,7,13 In the article Distribuição de Vagas de Residência Médica e de Médicos nas Regiões do País, after citing examples from various studies around the world, Nunes et al.2 categorically state that the supply of quality residency programs helps to retain physicians in the regions in which they attended such programs, and that MR, not the undergraduate course, is largely responsible for redistributing physicians.

A unified selection process modeled on the National Secondary Education Examination14 (ENEM, acronym in Portuguese) could help to tackle the lack of assistance in underprivileged areas. In such a model, candidates ranked on a common scale could be directed, respecting their preferences, to distant locations, in line with current public policies that guide specialists to underserved regions.

The unified process for MR vacancies in Brazil would rely on an item bank developed according to criteria of validity and reliability described by current measurement theories, and candidates would be ranked on a common scale.

As vacancies in the most competitive institutions are filled, the remaining candidates are distributed to other institutions according to their rank. It is believed that, because of some advantages, many candidates would accept attending MR in other locations, even if this was not their first option. One of these advantages is the possibility of continuing one's training immediately; another is the opportunity to experience other communities and cultures. This procedure has been followed since 2009 to direct ENEM applicants to the participating institutions of higher education.14
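The allocation logic described above can be sketched in a few lines. This is only a minimal illustration, assuming that each candidate submits an ordered list of preferred institutions and that each institution has a fixed number of vacancies; the function name `allocate`, the data layout and the sample values are hypothetical, and the sketch is not a description of the procedure actually used with ENEM.

```python
def allocate(scores, preferences, vacancies):
    """Assign candidates to institutions in descending order of score.

    scores:      {candidate: score on the common scale}
    preferences: {candidate: [institutions, most preferred first]}
    vacancies:   {institution: number of open positions}
    All names and structures here are hypothetical.
    """
    remaining = dict(vacancies)  # copy so the input is not modified
    placement = {}
    # Higher-ranked candidates choose first; ties are broken by name.
    for cand in sorted(scores, key=lambda c: (-scores[c], c)):
        placement[cand] = None  # stays None if every preferred option is full
        for inst in preferences.get(cand, []):
            if remaining.get(inst, 0) > 0:
                remaining[inst] -= 1
                placement[cand] = inst
                break
    return placement


# Hypothetical candidates, scores and institutions A, B, C.
scores = {"ana": 1.8, "bruno": 0.9, "carla": 1.2, "davi": -0.3}
preferences = {
    "ana": ["A", "B"],
    "bruno": ["A", "B", "C"],
    "carla": ["A", "C"],
    "davi": ["A"],
}
vacancies = {"A": 1, "B": 1, "C": 1}
result = allocate(scores, preferences, vacancies)
```

In this toy case the single vacancy at the most sought-after institution goes to the top-ranked candidate, and the next candidates are placed at their second or third options; a candidate who lists only the full institution remains unplaced, which suggests why listing several options matters in such a scheme.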

Scholars in the field of medical education indicate the need for changes in the form of access to MR vacancies, especially with regard to the objective test of knowledge in the basic areas.2,15 The items of this test are generally analyzed according to Classical Test Theory (CTT), but this analysis is only possible after the test has been administered. Any problems with items are therefore detected late, and the quality of the selection is often impaired. Items may have problems in their wording, conceptual errors, or inadequate discrimination or difficulty.
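To make the classical item analysis more concrete, the sketch below computes two statistics commonly used in CTT: the proportion of correct answers (difficulty index) and the correlation between an item and the total score on the remaining items (discrimination index). The response matrix, the function name `item_analysis` and the 0.2 cut-off are assumptions chosen only for illustration; they do not come from any MR examination.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Classical item statistics for a 0/1 response matrix.

    responses: list of rows, one per candidate; each row is a list of 0/1
    item scores. Returns one (difficulty, discrimination) pair per item:
    difficulty is the proportion of correct answers, discrimination is the
    correlation between the item and the total score on the other items.
    """
    n_items = len(responses[0])
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        p = mean(item)
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            r = 0.0  # correlation is undefined when a variable is constant
        else:
            cov = mean([x * y for x, y in zip(item, rest)]) - p * mean(rest)
            r = cov / (sd_item * sd_rest)
        stats.append((p, r))
    return stats


# Hypothetical answers of six candidates to four items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
stats = item_analysis(responses)
# Flag items whose discrimination falls below an assumed cut-off of 0.2.
flagged = [j for j, (p, r) in enumerate(stats) if r < 0.2]
```

Note that these statistics can only be computed once responses exist, which is exactly the limitation pointed out above: with a single administration, a weak item is identified only after it has already affected the ranking.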

Considering the complexity of preparing a selection examination, and that every institution in the various regions of the country is responsible for its own MR selection process, it is easy to see that some of these tests are of questionable quality with regard to both the instrument applied and the contents required. A unified examination, with instruments constructed according to current concepts of validity and reliability, could therefore improve the quality of selection and could have a feedback effect toward a more uniform curriculum in undergraduate courses in Medicine in Brazil.

Entrance exams to medical residency programs

The selection process for MR may, at the institution's discretion, be conducted in two phases: written and practical. The first phase is mandatory and consists of a written, objective examination with an equal number of questions in the specialties of Internal Medicine, Surgery, Pediatrics, Obstetrics and Gynecology, and Preventive and Social Medicine, with a weight of at least 50%. The second phase is optional and consists of a practical test with a weight of 40-50% of the total score and a curriculum analysis with a maximum weight of 10% of the total score.16
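The weighting rules above amount to a simple weighted sum. The sketch below is one possible reading of them, assuming that every component is scored on a 0-100 scale and that the weights add up to 1; the function name `final_score` and the sample scores are hypothetical.

```python
def final_score(written, practical=None, curriculum=None,
                w_written=1.0, w_practical=0.0, w_curriculum=0.0):
    """Combine phase scores (each assumed to be on a 0-100 scale).

    Weight limits follow the rules quoted in the text: written >= 50%,
    practical 40-50% when a second phase exists, curriculum <= 10%.
    """
    if abs(w_written + w_practical + w_curriculum - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    if w_written < 0.5:
        raise ValueError("written test must weigh at least 50%")
    if w_practical and not 0.4 <= w_practical <= 0.5:
        raise ValueError("practical test must weigh 40-50%")
    if w_curriculum > 0.1:
        raise ValueError("curriculum analysis must weigh at most 10%")
    total = w_written * written
    if w_practical:
        total += w_practical * practical
    if w_curriculum:
        total += w_curriculum * curriculum
    return total


# Institution with a written phase only.
single_phase = final_score(72.0)
# Institution with both phases: 50% written, 40% practical, 10% curriculum.
two_phase = final_score(80.0, practical=70.0, curriculum=90.0,
                        w_written=0.5, w_practical=0.4, w_curriculum=0.1)
```

With the second set of weights the total is 0.5 × 80 + 0.4 × 70 + 0.1 × 90, that is, about 77 points.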

Currently, 53 specialties and 54 areas are recognized in Brazil. Among the 53 specialties, 29 have direct access, i.e., the only prerequisite is an undergraduate degree in Medicine. The other specialties have indirect access, which requires prior training in the basic areas of knowledge.2

The proposed unification of processes would include the objective test in the first phase for specialties with direct access, since currently these specialties are a priority for SUS. The other steps, with fewer competitors, could be decided by the institution.


The proposed unified selection in medical residency programs

The selection processes for medical residency are acknowledged as necessary and are intended to identify students who have knowledge and skills considered important; they also influence the curriculum of undergraduate courses.

Although these tests satisfactorily fulfill their role of ranking candidates, they have drawbacks. One of them is the decentralization of selection processes, which limits candidates' options in competing for one of the positions offered. It also restricts institutions' ability to recruit, disadvantaging those located further away from the major urban centers.

An alternative for MR selection would then be to unify the vacancies through a single test, democratizing participation in selection processes and guiding young physicians in their fields of interest to underserved regions, in addition to the current public policies for medical training and for directing these professionals to priority areas for SUS. Furthermore, this unification would be important because it could promote curriculum integration among the various undergraduate courses in the country.

Use of item response theory in building the item bank

A large-scale assessment such as the one proposed in this study strongly influences educational policies, the curricula of the participating institutions, and the professional future of young candidates, which makes it important to examine the many variables involved in constructing, administering and scoring the assessment.

Large-scale assessments commonly make use of an item bank, which can be defined as a database of items that records attributes of each item, such as its parameters and the contents assessed.17

Item parameters can be estimated by CTT, but it has some limitations. The main limitation is that the measurements depend on the measurement instrument used (the test) and on the group assessed. This dependence can lead to errors and uncertainties in the selection of candidates, since competitive examinations in general fill vacancies based on the raw scores obtained by the candidates, which are simply the count of questions answered correctly.18,19 The score achieved by a candidate therefore depends not only on the candidate's ability but also on the difficulty of the test, so respondents can only be compared if they take the same instrument.17,20

An alternative to CTT, which is currently widely used in preparing large-scale assessments, is Item Response Theory (IRT), which consists of mathematical models that represent the relationship between the characteristics of the respondent (abilities) and the characteristics of the item (difficulty, discrimination, probability of a correct answer by guessing).

In IRT, the relationship between an individual's ability and the probability of that individual answering an item correctly, given the parameters of the item, is an increasing function called the item characteristic curve (ICC). Figure 1 illustrates the characteristic curve of an item considered efficient.21

Fig. 1 - Characteristic curve of the item. Source: BILOG-MG.21


Parameter b is a measure of item difficulty and is expressed in the same unit as ability (horizontal axis). Figure 1 shows that the higher the ability of the candidate, the greater the probability (vertical axis) of answering the item correctly. The scale used is (0,1), i.e., the mean is zero and the standard deviation is 1. A vertical line drawn at a given ability intersects the ICC at the probability that a candidate with that ability answers the item correctly. Note that the further to the right the ICC lies, the more difficult the item is.
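The behavior of the curve can be reproduced numerically. The three item characteristics named in the text (discrimination, difficulty and correct answer at random) correspond to the parameters usually written a, b and c in the three-parameter logistic (3PL) model; the article does not state which model generated Figure 1, so the use of the 3PL here, like the parameter values, is an assumption made only for illustration.

```python
import math


def icc(theta, a, b, c):
    """3PL item characteristic curve (assumed model).

    P(correct | theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    theta: ability, a: discrimination, b: difficulty, c: guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Two hypothetical items that differ only in difficulty b.
easy = dict(a=1.5, b=-1.0, c=0.2)
hard = dict(a=1.5, b=1.0, c=0.2)

# Probability of a correct answer at several abilities on the (0,1) scale.
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(icc(theta, **easy), 3), round(icc(theta, **hard), 3))
```

For every ability the harder item yields the lower probability, which is the numerical counterpart of its curve lying further to the right. Under this model, a candidate whose ability equals b has a probability of (1 + c)/2 of answering correctly.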

IRT was developed in the United States in 1952 by Lord22 and in Denmark in 1960 by Rasch.23 The method became widely known from 1968 onward, with the work of Lord and Novick entitled Statistical Theories of Mental Test Scores.24

Some current examples of large-scale assessments that use IRT in their item banks are the Graduate Record Examination (GRE), the Scholastic Assessment Test (SAT), the Test of English as a Foreign Language22 (TOEFL), the Programme for International Student Assessment25 (PISA), Prova Brasil,26 the Sistema de Avaliação da Educação Básica27 (SAEB) and ENEM,14 among others.

Using the models provided by IRT, items are designed to assess each latent trait, and each item can then be described by some of its parameters, such as the latent trait assessed, the discrimination index, the difficulty level and the probability of a correct answer by guessing. In IRT, the process of estimating the item parameters is known as calibration.20

Some advantages of unifying the selection process for MR programs by means of an item bank calibrated according to IRT include:

• Using pre-tested items: items are classified according to difficulty, discrimination and probability of a correct answer by guessing, among other characteristics. Items found to be faulty are discarded.

• Developing a single ability scale: item parameters and respondents' abilities are placed on a common scale, making it possible to compare candidates and item parameters, even across distinct tests or groups.

• Allowing candidate mobility: candidates can take part in the selection process in their city of origin, without traveling to the location of the desired institution. This is also a way of democratizing access, since local graduates are not prioritized over candidates from elsewhere.

• Developing several editions of the selection test: equivalent tests can be assembled as needed. Candidates are scored on a single scale, so unfilled vacancies can be avoided, even in institutions in the least favored regions.

• Deepening analyses: the responses allow a qualitative analysis of the items and of the respondent population.

• Guiding education: the examination can foster the standardization of the curricula of undergraduate medical courses in the country.
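The idea of a single ability scale across different test editions can be illustrated with a small calculation. The sketch below assumes a three-parameter logistic model, items that have already been calibrated on a common scale, and a standard normal prior for ability; it uses an expected a posteriori (EAP) estimate on a grid, which is one common scoring approach but not necessarily the one a unified MR examination would adopt. All item parameters and answer patterns are hypothetical.

```python
import math


def p_correct(theta, a, b, c):
    """3PL probability of a correct answer (assumed model)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def eap_ability(items, answers, grid_step=0.05):
    """Posterior mean of ability given calibrated items and 0/1 answers.

    items:   list of (a, b, c) tuples already on the common scale
    answers: list of 0/1 scores, in the same order as items
    Prior: standard normal, matching the (0,1) scale.
    """
    num = den = 0.0
    n_points = int(round(8.0 / grid_step)) + 1  # grid from -4 to 4
    for k in range(n_points):
        theta = -4.0 + k * grid_step
        weight = math.exp(-0.5 * theta * theta)  # unnormalised N(0,1) density
        for (a, b, c), u in zip(items, answers):
            p = p_correct(theta, a, b, c)
            weight *= p if u == 1 else (1.0 - p)
        num += theta * weight
        den += weight
    return num / den


# Two hypothetical test forms drawn from the same calibrated bank.
easy_form = [(1.2, -1.5, 0.2), (1.0, -1.0, 0.2), (1.4, -0.5, 0.2), (1.1, 0.0, 0.2)]
hard_form = [(1.2, 0.5, 0.2), (1.0, 1.0, 0.2), (1.4, 1.5, 0.2), (1.1, 2.0, 0.2)]

# Both candidates answer 3 of 4 items correctly, but on different forms.
theta_easy = eap_ability(easy_form, [1, 1, 1, 0])
theta_hard = eap_ability(hard_form, [1, 1, 1, 0])
```

Although both hypothetical candidates have the same raw score, the one who answered the harder form receives the higher ability estimate, so candidates who take different editions of the test remain comparable.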

Professional associations, the government and scholars highlight the need for changes to MR, and particularly to the form of access, as noted earlier in this article. It is expected, therefore, that an item bank calibrated according to IRT and applied to unified selection examinations for MR would provide a substantial gain in quality over current selection processes, meeting the demand of institutions for a more efficient selection and also the needs of SUS regarding the redistribution of professionals in the country.


Currently, the selection tests for MR programs are developed and administered in a decentralized way, with each institution responsible for its own assessment. Experts question the quality of these tests, their subjectivity and their lack of fairness, since many of these positions are contested both by local candidates and by candidates coming from other cities.2,12

The institutions responsible for MR programs do not usually publish on their official websites reports on the quality of the items in their tests, or on the validity and reliability of their instruments. In fact, research on large-scale assessments in Brazil can still be considered limited. Researchers in the assessment field criticize both the scarcity of studies on Brazilian tests, which are discarded without any assessment of their validity, reliability and impact on education and society, and the poor official disclosure of the results of these examinations by the agencies responsible.28,29

This scarcity of studies on large-scale assessments in Brazil results in poor transparency of the processes involved in preparing, marking and scoring examinations. Another factor related to this sparse literature is the lack of resources and other support from the government and the agencies responsible to encourage research on these assessments.

Problems with the quality and difficulty of items are common and can be detected only after the instrument has been used.18-20,30,31 This is why pre-tests are required when building an item bank. In fact, when developing an item bank for assessing learning contents in Spain, Garcia et al.32 excluded about 25% of the items, a proportion similar to that found when building the bank for Hezinet, a multimedia system used to teach the Basque language to foreigners.30 In its methodology for developing the item bank and continually selecting new items for it, ENEM systematically pre-tests items with high school students across the country.26


Problems with items in large-scale tests, such as the vestibular (a university entrance exam in Brazil), public competitive examinations or selection examinations for MR, are common, but few are reported in the press or on the websites of the institutions responsible. Candidates are only informed of items canceled because of conceptual problems. Problems with the quality of items, such as discrimination, difficulty or probability of a correct answer by guessing, are not normally disclosed. It is also uncommon for studies on the validity and reliability of the instruments used in these examinations to be published.

Studies indicate that 20-25% of the items of an instrument are normally excluded because of conceptual problems or poor discrimination. An item bank developed using sound pedagogical procedures and pre-tested before the assessment instrument is assembled has a great advantage, with quality often superior to that of the traditional tests used in selection for MR programs.

To develop a good assessment instrument, the items should be evaluated in pedagogical, quantitative and qualitative terms. These analyses must verify whether the assessment meets its objectives and the criteria for validity and reliability. IRT models allow greater flexibility in constructing tests and in analyzing the answers given by those assessed, and result in a fairer and more consistent classification.18-20,31

The entire procedure of drafting and pre-testing items to compose a large item bank calibrated according to IRT has been established and well defined since the reformulation of ENEM in 2009.14 ENEM now occupies a prominent place in the Brazilian education sector for its contribution to democratizing access to higher education, unifying access to a large number of universities.26

The knowledge acquired through the ENEM experience, if replicated in the unified selection for MR programs, could provide a substantial gain in quality over current selection processes, meeting the demand of institutions for a more efficient selection and also the needs of SUS regarding the redistribution of professionals in the country.


One of the problems faced by medical education is the expansion of the number of courses without minimum guarantees of their quality.1,3 Educational quality requires systematic assessments and pre-set targets. Assessment is important in any education system, prompting necessary changes and monitoring teaching and learning.28,33,34

Furthermore, the proposed unification of the selection examination for MR could also serve to assess undergraduate courses in Medicine. In addition, because IRT models make it possible to compare different tests and the individuals who answer them, these assessments could provide data for many studies, including longitudinal ones.

Another problem highlighted in the literature is the maldistribution of specialist physicians across the regions of the country. Physicians play an essential role in the health care of the population, but the country faces serious problems of lack of assistance in some isolated areas of large cities.1,5,7,13 Policy actions should be directed to the training of qualified specialists, according to the health needs of each region of the country. Changes in the MR selection examinations are always mentioned by experts as one of the ways to deal with this problem.7,10,11

As in ENEM,14 which selects candidates for places in undergraduate courses, a unified examination for MR programs would allow candidates to compete for the vacancies offered by all institutions, thus promoting regional mobility.

