AI for Research

This guide provides information and recommended resources on the applications of Generative Artificial Intelligence (GenAI) in research.

AI Concerns

Lack of Privacy & Transparency 

Many generative AI companies, including OpenAI, Anthropic and Google (creators of ChatGPT, Claude and Gemini, respectively), do not disclose information about their training data or methods. This lack of transparency can make it challenging for users to fully understand the biases and limitations of these tools and make informed decisions about their use. 

Entering personal or confidential information into generative AI applications carries risks, and the lack of proper data privacy and security controls by developers exacerbates the issue. Personal data shared with these applications may become accessible to other users, potentially leading to identity theft and privacy violations. Similarly, confidential corporate information could be exposed to competitors or third parties.

“The only foolproof solution? Keep sensitive data far away from LLMs.” (Falconer 2023)

Hallucinations & Misinformation 

Generative AI models are open-ended: they place no inherent limitations on their output, which means their responses are not necessarily grounded in fact and may amount to misinformation. Some AI tools have also been exploited to create fake images, videos, and audio recordings, known as "deepfakes", which can be used to spread disinformation and manipulate public opinion.
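
To see why fluent output need not be factual, consider a minimal sketch of how text generators sample the next words from a probability distribution. The prompt and probability table below are hypothetical, hand-written values, not taken from any real model:

    import random

    # A minimal sketch of open-ended sampling. The probability table is
    # entirely hypothetical; a real model spreads probability over many
    # continuations, including factually wrong ones.
    continuations = {
        "in 1969.": 0.70,  # correct completion for this made-up prompt
        "in 1972.": 0.20,  # plausible-sounding but wrong
        "in 1869.": 0.10,  # clearly wrong, yet still has nonzero probability
    }

    random.seed(4)
    prompt = "Apollo 11 landed on the Moon"
    words = list(continuations)
    weights = list(continuations.values())
    # Sampling keeps output varied, but wrong continuations are emitted at a
    # rate set by their probability, with no internal fact check.
    for _ in range(5):
        print(prompt, random.choices(words, weights=weights)[0])

Run repeatedly, this emits the wrong dates at a steady rate, and each wrong answer reads just as fluently as the right one.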

AI-generated text can often appear authoritative and confident, making it challenging to distinguish between reliable and unreliable information. While AI can produce logical and coherent responses, it lacks any real ability to analyse information and context or to produce original thought.

Furthermore, generative AI tools may provide outdated information, as they are often trained on historical datasets rather than having access to real-time data. This can result in dated or inaccurate representations of current events. 

As the end user of the AI's output, it is your responsibility to critically evaluate the quality of the information and detect potential issues or inaccuracies in the AI's response.

Bias 

Text or images generated by AI tools have no human author, but the underlying models are trained on data created by humans, and that data includes human biases. Unlike humans, AI tools cannot consistently differentiate between biased and unbiased information when formulating responses: they are trained simply to predict the most likely sequence of words in response to a given prompt. As a result, AI systems like ChatGPT have been known to exhibit socio-political biases and, in some cases, generate responses containing sexist, racist, or offensive content.
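
A minimal sketch of this prediction step, using a hypothetical hand-written probability table rather than a learned model, shows how skews in the training data pass straight through to the output:

    # Minimal sketch of next-word prediction. The probability table is
    # hypothetical; a real model learns such probabilities from human-created
    # text, inheriting whatever skews that text contains.
    next_word_probs = {
        ("the", "doctor"): {"he": 0.55, "she": 0.30, "they": 0.15},
        ("the", "nurse"): {"she": 0.60, "he": 0.25, "they": 0.15},
    }

    def predict_next(context):
        """Return the most likely next word for a two-word context."""
        probs = next_word_probs[context]
        # The model picks the highest-probability continuation; it has no
        # mechanism for asking whether that choice reflects a bias.
        return max(probs, key=probs.get)

    print(predict_next(("the", "doctor")))  # "he"  (skew inherited from data)
    print(predict_next(("the", "nurse")))   # "she"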

Generative AI tools that employ reinforcement learning with human feedback (RLHF) face an inherent challenge due to the potential biases of the human testers providing feedback. This kind of training also makes these models susceptible to confirmation bias: an AI chatbot is more likely to agree with the user than not (Nehring et al. 2024).
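
As a toy illustration of how rater bias becomes model bias, the sketch below simulates pairwise feedback in which hypothetical raters prefer an agreeable answer 65% of the time (an assumed figure, chosen only for illustration). A reward model fit to those votes would then score agreement higher:

    import random

    # Toy simulation of biased pairwise feedback. The 0.65 preference rate is
    # an assumption for illustration, not a measured property of any system.
    RATER_PREFERS_AGREEMENT = 0.65

    random.seed(0)
    votes = {"agree": 0, "push_back": 0}
    for _ in range(10_000):
        # Each round, a rater votes for one of two candidate responses.
        winner = "agree" if random.random() < RATER_PREFERS_AGREEMENT else "push_back"
        votes[winner] += 1

    # A reward model fit to these votes favours agreeable answers, so
    # optimising a chatbot against it nudges the chatbot toward agreement.
    total = sum(votes.values())
    for response, wins in votes.items():
        print(f"{response}: learned reward ~ {wins / total:.2f}")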

Copyright Infringement  

Using generative AI tools to create derivative works from copyrighted material may infringe on the copyright owner's exclusive rights. Some AI image and text generation tools have been trained on web content without the consent of the content owners, raising legal concerns. Several lawsuits have been filed against AI platforms for copyright infringement due to the use of artists' and writers' content without permission. While the outcome of these cases remains uncertain, some experts have cited fair use arguments in defense of AI training data usage. 

Consider obtaining the necessary permissions before entering copyrighted content into AI systems, to avoid potential legal issues.

Environmental Cost 

The carbon footprint of large language models has far-reaching implications for both the environment and the economy. The high energy consumption of these models creates pressure to privatise and commodify the platforms, moving them away from open-source accessibility toward subscriber-based models run for corporate profit. This shift has economic implications, as the costs of carbon impacts are externalised rather than factored into the operational costs or responsibilities of the platform. The carbon footprint spans the entire lifecycle of these models, from design and development through their operational phase to their eventual legacy.
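
A back-of-envelope calculation makes the training-phase footprint concrete. Every figure below is an assumed placeholder, chosen only to show how the arithmetic works, not a measurement of any real model:

    # Back-of-envelope estimate of training-run emissions. All inputs are
    # assumed placeholders, not measurements of any real system.
    gpus = 1_000               # accelerators used for training (assumed)
    power_per_gpu_kw = 0.4     # average draw per accelerator, kW (assumed)
    training_hours = 30 * 24   # a 30-day training run (assumed)
    pue = 1.2                  # datacentre overhead multiplier (assumed)
    grid_kgco2_per_kwh = 0.4   # grid carbon intensity (assumed)

    energy_kwh = gpus * power_per_gpu_kw * training_hours * pue
    emissions_tonnes = energy_kwh * grid_kgco2_per_kwh / 1_000

    print(f"Energy: {energy_kwh:,.0f} kWh")              # 345,600 kWh
    print(f"Emissions: {emissions_tonnes:,.0f} t CO2e")  # 138 t CO2e

Even with these modest assumed inputs, a single training run consumes hundreds of megawatt-hours, and that is before the ongoing cost of serving queries is counted.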

AI Ethics 

AI ethics is an actively developing multidisciplinary field that aims to address the concerns described above. It studies how to ensure that AI is developed and used responsibly, maximising its beneficial impact while reducing risks and adverse outcomes. This means adopting a safe, secure, unbiased, and environmentally friendly approach to AI.

Further Reading 

Hallucinations, Misinformation & Bias

  • Vakali, A., & Tantalaki, N. (2024). Rolling in the deep of cognitive and AI biases. arXiv:2407.21202.

  • Nehring, J., Gabryszak, A., Jürgens, P., Burchardt, A., Schaffer, S., Spielkamp, M., & Stark, B. (2024). Large Language Models Are Echo Chambers. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 10117–10123, Torino, Italia. ELRA and ICCL.
