Artificial Intelligence (AI) is an ever-evolving field that presents new opportunities for research while amplifying risks to data privacy and confidentiality. This guidance covers the use of AI, machine learning, deep learning, and related AI techniques in research activities.
While there are many definitions, AI is generally the imitation of human intelligence through technology, enabling a machine to learn and perform tasks typically associated with human intellect.
When appropriate, consider the following information when submitting a new application involving AI:
- Investigators must include a Data Card, when available, for the research project. The Data Card should include:
- The source of the dataset used to develop the AI. If investigators develop their own dataset to train the AI model, describe how the information will be collected, whether it is private, and whether the data will include identifiers.
- The purpose of the AI and whether the study will use generative or predictive AI, i.e., whether the AI will make predictions based on existing datasets or generate novel outputs.
- Data disposition, including whether the data will be used for future AI training and whether the model will be shared, and with whom.
- If the purpose of the research is to develop a new AI model or further develop an existing AI model, investigators should include a Model Card, which includes:
- A description of the model, intended purpose and audience, and how the model will be maintained.
- Limitations of the model and steps to mitigate risk.
- How investigators will monitor and evaluate research outcomes, and how these will be compared with the intended outcomes.
- In addition to the Data and Model Cards, investigators must ensure the consent form provides adequate information for subjects to make a fully informed decision; transparency about the use of AI and the study purpose is paramount. The consent form should include relevant information for each of the following:
- Whether research data will be used to further develop the AI model, and whether subjects' data will be used to develop other AI models. The consent form must explain the implications of sharing data broadly with an AI model, including that once shared, information cannot be withdrawn.
- A thorough description of whether data may be shared, and confirmation that shared information will not contain, directly or indirectly, information that may be used to identify individuals.
- The purpose of the AI model and any known limitations, including potential risks and how they will be mitigated.
- How data will be stored, described in lay terms.
- If the research uses commercial AI software, whether subject data will be stored and used to further develop the product.
- Lastly, investigators must uphold the three ethical principles of the Belmont Report:
- Respect for persons: treating people as autonomous agents and protecting those with diminished autonomy.
- Beneficence: minimizing potential harms while maximizing the benefits of participation. Avoid encoding bias, and regularly review outputs for biased predictions.
- Justice: fairly distributing the benefits and risks of the research.
- For research regulated by the FDA, additional considerations may apply. See FDA guidance on AI and Machine Learning.
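The Data Card and Model Card elements listed above can be thought of as structured checklists. As an illustration only, and not an official template, the following sketch shows one way to track whether a draft card covers each element; all field names are assumptions chosen to mirror the bullet points above:

```python
# Illustrative sketch only: the Data Card and Model Card elements from the
# guidance above, captured as structured records. Field names are assumptions,
# not an official template.

DATA_CARD_FIELDS = [
    "dataset_source",        # where the training data comes from
    "collection_method",     # how investigator-collected data is gathered
    "contains_identifiers",  # whether the data is private / identifiable
    "ai_type",               # "generative" or "predictive"
    "purpose",               # what the AI is for
    "data_disposition",      # future training use, model sharing, recipients
]

MODEL_CARD_FIELDS = [
    "description",           # what the model is and does
    "intended_purpose",      # intended use and audience
    "maintenance_plan",      # how the model will be maintained
    "limitations",           # known limitations of the model
    "risk_mitigation",       # steps taken to mitigate risk
    "monitoring_plan",       # how outcomes are compared with intended outcomes
]

def missing_fields(card: dict, required: list[str]) -> list[str]:
    """Return the required fields that are absent or empty in a draft card."""
    return [f for f in required if not card.get(f)]

# Hypothetical example: a draft Data Card still missing its disposition plan.
draft = {
    "dataset_source": "publicly available benchmark corpus",
    "collection_method": "n/a (no new collection)",
    "contains_identifiers": "no",
    "ai_type": "predictive",
    "purpose": "predict appointment no-shows from de-identified records",
    "data_disposition": "",
}
print(missing_fields(draft, DATA_CARD_FIELDS))  # → ['data_disposition']
```

A completed card would return an empty list, signaling that every element named in the guidance has at least been addressed before submission.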
Investigators must also consider any additional requirements, such as funding, legal, or journal requirements, that may apply to the research. In addition to the information provided above, investigators should review the guidance on AI provided by MIT's IS&T: Initial guidance for use of Generative AI tools. This guidance will help ensure investigators are following MIT's best practices for the use of Generative AI.
As with all research, data privacy and confidentiality are of the utmost importance. Investigators must carefully determine what information is necessary for the purpose of the proposed research and how that information will be protected.
Canned language for consent forms
Use of AI to monitor and screen content:
During the process of developing the creative tool under research in this study, we will use techniques to mitigate exposure to harmful or inappropriate content. These techniques will include automated content moderation to filter out inappropriate output from the tool or input from the participant. In cases where unforeseen harmful content is presented, you will [describe action when subjects encounter harmful or inappropriate content].
Subject data will be used to train the AI:
Your data may be used to help train and further develop the AI. Once shared, information cannot be withdrawn from the AI learning software, and shared information may be available for future research and commercial interests. Your data will be kept confidential, and any information used for development will not contain identifiable information.