ISSN 2707-0476 (Online)

University Library at a New Stage of Social Communications Development. Conference Proceedings, 2023, No. 8

UniLibNSD-2023

THE CONTRIBUTION OF THEORY AND RESEARCH TO THE TRANSFORMATION OF LIBRARIES

UDC 004.8 + 001.8

YAROSHENKO T. O.

National University of Kyiv Mohyla Academy (Kyiv, Ukraine),

e-mail: yaroshenko@ukma.edu.ua, ORCID 0000-0002-2985-2333

IAROSHENKO O. I.

National University of Kyiv Mohyla Academy (Kyiv, Ukraine),

e-mail: yaroshenkooi@ukma.edu.ua, ORCID 0000-0002-4716-5705

Artificial Intelligence (AI) for Research Lifecycle: Challenges and Opportunities

Objective. This article aims to review the progress of AI technologies concerning their potential impact on academia, research processes, scientific communication, and libraries. Methods. AI tools for the research lifecycle and their potential impact on academia and libraries were identified from various sources, mostly from the most influential recent scientific publications. Results. AI has become a driving force, creating both opportunities and challenges. Transformative AI-powered tools, exemplified by ChatGPT, Llama-2, Google Bard, Microsoft Bing, and Jasper Chat, among others, are useful across a broad spectrum of contexts, extending their impact to the research process and publishing, as well as to librarianship. The enthusiastic embrace of AI in research is tempered by a pervasive concern over the potential for data fabrication, which can significantly compromise ethical standards and academic integrity. There is an urgent need to understand the corresponding opportunities, challenges, and dangers. Some aspects of the use of AI tools at different stages of the research lifecycle are considered, and the main advantages and risks are analyzed. Conclusions. AI has the potential to drive innovation and progress in a wide range of fields and to propel academia and librarianship into both exhilarating and challenging new frontiers. AI-powered tools represent major advances with the potential to significantly affect academia, scholarly research, publishing, and university libraries, but their use raises ethical considerations, of which privacy and bias are just two examples.

Keywords: artificial intelligence; AI; LLM; ChatGPT; Llama-2; academic libraries; scientific communication; scholarly publishing; research lifecycle

Introduction

AI technology is already commonplace in everyday life and is no longer the sole purview of futurologists. With the emergence of Artificial Intelligence technologies, services, and applications, there is much discussion about the inevitable changes in many areas, including AI's impact on research and scholarly communication as well as on librarianship. The rapid pace of AI development and adoption raises crucial questions about intellectual freedom, equity and privacy, automation, the evolution of necessary digital literacy skills, relevant intellectual property policy frameworks, and more. The main concerns associated with employing AI are potential academic integrity violations, specifically bias, data fabrication and falsification, and harm to critical thinking. However, technological advancement is unavoidable, so it is no longer viable to ignore or forbid AI services and applications. Scholars already employ them for a variety of research-related tasks, including searching, writing, proofreading, or translating texts, identifying and summarizing sources, data analytics, data visualization, data and text mining, coding, etc. Undoubtedly, over time, even more advanced tools will emerge, specifically designed for the needs of scientists. We undertook this brief study to reflect on this quickly evolving topic, primarily using the most pertinent and citable publications. Over the last year, much has been written about generative chatbots (especially ChatGPT) and their potential impact across the spectrum of research contexts. As of the date of this study (September 2023), a search for the single term "ChatGPT" returns over 46,600 papers in Google Scholar, 17,062 in Dimensions, and 89,400 in Semantic Scholar.

Furthermore, this number is likely to keep growing rapidly. Most of the debate centers on ethical dilemmas, specific tasks, technical details, and the application of AI in business and education. Scholars and librarians also need to understand the opportunities and challenges of using AI across the research lifecycle and to analyze the main advantages and risks.

This article aims to review the progress of AI technologies concerning their potential impact on academia and libraries, research lifecycle, and scientific communication.

Materials and Methods

The research is based on an analysis of scholarly publications from 2017–2023. The AI landscape is dynamic and continuously evolving. AI services and applications were identified from various sources, mostly the most influential and highly cited books and articles on AI tools for research and librarianship (Alto, 2023; Currie, 2023; Dashti, Londono, Ghasemi, & Moghaddasi, 2023; Dwivedi et al., 2021, 2023; Floridi & Chiriatti, 2020; Gasparini & Kautonen, 2022; Hervieux & Wheatley, 2022; Hill-Yardin, Hutchinson, Laycock, & Spencer, 2023; Leslie, 2023; Lund & Wang, 2023; Manohar & Prasad, 2023; Vaswani et al., 2017; Yan et al., 2023). The list was then expanded through web searches and through tools developed within key AI communities, such as OpenAI, Google AI, Microsoft Research, Stanford AI Lab, MIT Computer Science and Artificial Intelligence Laboratory, ICML (International Conference on Machine Learning), and AAAI (Association for the Advancement of Artificial Intelligence), as well as relevant IFLA statements. Additionally, we looked through specialized publications (Journal of Artificial Intelligence Research (JAIR), Artificial Intelligence Journal (AIJ), Machine Learning Journal, IEEE Transactions on Artificial Intelligence, AI Communications, LIBER Quarterly, etc.).

Results and Discussion

A computer system that can reason and think like a human, as well as learn and develop, is no longer a pipe dream. AI technologies have already permeated our lives almost invisibly: numerous programs and chatbots help people find information online, call a cab, conduct financial transactions, book tickets, and schedule doctor's appointments. In recent decades, examples of AI have also included virtual voice assistants such as Siri and Amazon Alexa, face recognition, Google Maps, targeted advertising on sales and service platforms, etc. The further development of AI could perhaps have been ignored until it took another leap forward with the release of Large Language Model (LLM) software: ChatGPT, a Generative Pre-trained Transformer chatbot, was released to the world in November 2022, followed by GPT-4 in March 2023. Its developer, OpenAI, is a US research organization founded in 2015 by Elon Musk and Sam Altman, among others, with the mission “to ensure that Artificial General Intelligence (AGI) benefits all of humanity” (openai.com). The release caused a huge response in society. Only two months after launch, the number of active ChatGPT users reached 100 million, an all-time record among consumer applications (Facebook needed 4.5 years to achieve such a result). ChatGPT has been available to users in Ukraine since February 18, 2023 (the program is unavailable in the parts of Ukraine that russia has temporarily annexed, as well as in russia itself, Iran, Cuba, North Korea, Syria, and Sudan). Meta and Microsoft announced Llama-2 in July 2023. Other generative AI applications are being successfully developed and trained: Jasper Chat, Google Bard, Microsoft Bing, SciFact, Consensus, etc.

ChatGPT is not the first sophisticated AI tool to change research practices. Semantic Scholar (www.semanticscholar.org) has provided free, AI-driven search and discovery tools since 2015. Dimensions.ai (app.dimensions.ai) is the most extensive linked-data platform, covering everything from patents and policy documents to grants, publications, databases, and clinical trials. Grammarly (www.grammarly.com), a popular AI tool that was “born” in Ukraine, is used to improve academic writing. rTutor.ai (www.rtutor.ai) is an AI chatbot that can generate R code for statistical analysis. Research Rabbit (www.researchrabbit.ai) is an AI tool used to produce literature reviews. These are only a few examples.

Crossing the Rubicon? It became clear that this type of artificial intelligence (AI) technology would have huge implications for how researchers work. The creation of patents, the publication of scientific articles listing generative AI (including ChatGPT) as a contributing author, and other developments have alarmed the academic community. The editorial boards of scholarly journals, publishing houses, and universities are currently working to develop appropriate policies for the use of AI applications. ChatGPT's output has drawn a lot of praise, while its capacity to mislead, including its tendency to "hallucinate" or invent scientific terminology, phrases, and even quotations, has drawn a lot of criticism. The use of AI tools as instruments of misinformation, fake news, and malicious content has been criticized, even as the ethical and beneficial applications that might be made of them have been recognized.

What is AI? The term AI is often used as an umbrella term for multiple technologies: business analytics and data science; natural language processing; speech recognition and text-to-speech; machine learning, deep learning, and neural networks; machine reasoning, decision making, and algorithms; computer vision; and robots and sensors (Williams & Lowendahl, 2018). Russell, Norvig, Popineau, Miclet, and Cadet (2021) use the term AI to describe systems that mimic cognitive functions generally associated with human attributes such as learning, speech, and problem-solving. A more detailed characterization was presented by Haenlein and Kaplan (2019), who describe AI in terms of its ability to independently interpret and learn from external data to achieve specific outcomes via flexible adaptation. Research on generative AI can be dated to the 1960s, when Joseph Weizenbaum developed the chatbot ELIZA, one of the first examples of a natural language processing (NLP) system. Then, in the 2000s and 2010s, advances in computational capabilities, together with the huge amount of data available for training, made generative AI more practical and available to the general public, with a consequent boost in research. Another great milestone was reached in 2017, when a Google team introduced a new deep learning architecture called the Transformer in the paper "Attention Is All You Need" (Vaswani et al., 2017), which has been cited more than 60,000 times by now. It was revolutionary in the field of language generation. Transformers were the foundation for the LLM called Bidirectional Encoder Representations from Transformers (BERT), introduced by Google in 2018. Transformers are also the foundation of all the Generative Pre-trained Transformer (GPT) models introduced by OpenAI, including the GPT-3.5 model behind the ChatGPT release of November 2022 and GPT-4, released in March 2023. As a result, the years 2022 and 2023 have been termed the "years of generative AI", since they saw the widespread use of sophisticated AI tools and models.

One of the greatest applications of generative AI is its capability to generate human-like text, which makes it useful for a variety of natural language processing tasks such as language translation, summarization, and question answering. Indeed, generative AI algorithms can be trained on large amounts of text data and then used to generate new, coherent, and grammatically correct text, such as articles, in different languages (both in terms of input and output), as well as to extract relevant features from text, such as keywords, topics, or full summaries (Alto, 2023).

AI experts surveyed predict that AI will outperform humans in many activities in the coming years and decades, such as translating languages (by 2024), writing high-school essays (by 2026), driving a truck (by 2027), working in retail (by 2031), and even authoring a best-selling book (by 2049) or practicing surgery (by 2053). According to this research, there is a 50% chance that AI will outperform humans in all tasks within 45 years and that all human jobs will be automated within 120 years (Grace, Salvatier, Dafoe, Zhang, & Evans, 2018).

Which research assistant roles can artificial intelligence (AI) technologies and applications currently fill for scientists? Below we list the most popular applications at present (the list is by no means exhaustive, and new ones appear frequently), along with the potential areas of their use.

1. Defining the research question, searching for sources, summarizing reviews, extracting data, synthesizing literature, generating texts (as well as proofreading, translating, copywriting, converting audio to text, etc.) – ChatGPT-3, ChatGPT-4, Llama-2, Jasper Chat, Google Bard, Microsoft Bing, WordAI, CopyAI, Wordtune, Grammarly, QuillBot, Semantic Scholar, SciSpace, etc.

2. Reference managers (formatting a bibliography, working with the text) – Research Rabbit, Semantic Reader, ChatPDF, etc.

3. Creating and editing images, video, etc. – ChatGPT-4, Dall-E, Synthesis.OI, Midjourney, Descript, etc.

4. Analysis of hypotheses and concepts – SciFact, Consensus, etc.

5. Providing support for the design and framework of the experiment (determining study design, identifying outcome measures), data analysis (incl. data mining and text mining) – ChatGPT, Llama-2, Jasper Chat, Google Bard, Microsoft Bing.

6. Generating a presentation of the study (for conference report or poster etc.) – Canva AI, Designs.ai, DesignerBot, Appy Pie’s Research Poster Maker, etc.

We narrowed our focus to a select few tasks where AI tools can help scholars. Many other activities within the domain of research could benefit from the support of AI: data collection, study participant recruitment, research networking, public engagement, and many others. Scholars who use AI tools in the research lifecycle can benefit from their versatility and time-saving features, ultimately leading to more impactful research outcomes. But it is crucial to remember that AI tools and applications are simply that: tools, which should be used in conjunction with expert knowledge and judgment. As with any research project, careful consideration of the research question and study design is necessary to ensure the validity and reliability of the results (van Dis, Bollen, Zuidema, van Rooij, & Bockting, 2023).

University libraries have always played a crucial role in the research ecosystem, integrated into the research process not only as knowledge hubs but also through digital scholarship, research data curation, digital humanities, and more. Librarians' institutional expertise in knowledge management can be valuable in meeting the challenges of an AI-oriented future for education and science. AI and ML have the potential to add new dimensions and approaches to knowledge management processes in libraries, particularly knowledge organization, storage, and integration (IFLA, 2020). Naturally, libraries will need to create new tools and services based on AI technologies for a variety of library functions, including reference and information services, cataloging and metadata creation, content creation (e.g., to produce summaries, abstracts, and other types of content that can be used to improve access to library resources), and search and discovery (e.g., by understanding the nuances of natural language queries and providing more relevant results). Libraries can provide access to a wide range of AI-based tools and services to help scholars use AI technology effectively in their research, or even offer AI-related programming and services. Their traditional role as trusted partners in research communities gives libraries both an opportunity and a responsibility to educate their users on AI-related topics and to help them thrive in a society that uses AI.

We fully support IFLA's thesis on the role of libraries in implementing AI applications for researchers: “Libraries can support high-quality, ethical AI research. Many current ethical and inclusivity concerns linked to AI research and applications stem from incomplete, incorrect, or biased training data (‘garbage in, garbage out’). Trained librarians can lend their expertise in data storage and licensing, data quality assessment, and safe and ethical information storage to help researchers address some of the concerns around data. Libraries can also support ethical AI research and development by their procurement choices: purchasing AI technologies, which abide by ethical standards of privacy and inclusivity. This would both reaffirm the trust of users in libraries, and send a message to the AI research field by increasing the demand for ethical AI technologies” (IFLA, 2020). However, libraries should be mindful and employ AI tools responsibly and prudently. Libraries need to contemplate suitable measures to mitigate any potential risks associated with AI use.

Let's name the main disadvantages of AI applications for research and librarianship.

The most significant drawback of ChatGPT and similar applications is that the information they generate is not always accurate or unbiased. Biases in the training data may be reflected in the model's responses, which could lead to unfair or erroneous results. AI tools may offer incorrect links or even cite fictional studies or articles that do not exist (this kind of fabrication by a chatbot is often called "hallucination" or "delusion"). In the academic environment this is particularly dangerous, given the risk of presenting incorrect data or non-existent results in one's research. Therefore, it is extremely important always to review and edit generated texts and to check citations and references for accuracy, reliability, and completeness.
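As one possible illustration of the reference check recommended above, the following minimal sketch sorts a reference list into entries that carry a DOI-like string (which can then be resolved and compared against the cited title) and entries that need a manual search. The function names `extract_doi` and `triage_references` and the regular expression are illustrative assumptions, not part of any tool discussed in this article. A well-formed DOI does not prove that a reference is genuine: each DOI still has to be looked up and its metadata compared with the citation.

```python
import re
from typing import Dict, List, Optional

# Heuristic pattern for DOI-like strings: "10.", a 4-9 digit registrant
# code, a slash, then a suffix. This is an illustrative assumption, not
# an official validation rule.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_doi(reference: str) -> Optional[str]:
    """Return the first DOI-like string found in a reference, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Drop trailing punctuation that often follows a DOI in running text.
    return match.group(0).rstrip(".,;)")


def triage_references(references: List[str]) -> Dict[str, List[str]]:
    """Split references into those with a DOI to resolve and those
    that a person has to search for manually."""
    result: Dict[str, List[str]] = {"has_doi": [], "needs_manual_check": []}
    for ref in references:
        key = "has_doi" if extract_doi(ref) else "needs_manual_check"
        result[key].append(ref)
    return result


if __name__ == "__main__":
    sample = [
        "Currie, G. (2023). Academic integrity and artificial intelligence. "
        "doi: https://doi.org/10.1053/j.semnuclmed.2023.04.008",
        "Hervieux, S., & Wheatley, A. (Eds.). (2022). The rise of AI. Chicago, IL.",
    ]
    for group, refs in triage_references(sample).items():
        print(group, len(refs))
```

The sketch deliberately stops at triage and makes no network calls. Confirming that a DOI resolves to the cited work, and searching for entries without a DOI, remains a task for the author or librarian.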

Lack of context. ChatGPT and other AI chatbots may not fully understand the context and nuances of scientific text, so the texts they produce may not always be appropriate or acceptable. In addition, the initial version of ChatGPT, for example, was trained on data up to 2021 (as of 2023), so it may lack up-to-date context.

Intellectual property. AI chatbots can generate text that may be copyright-protected.

Security. AI chatbots can generate sensitive information, such as personal data, financial data, and even medical data, which could be a security risk.

Language. The accuracy and dependability of responses in other languages may be constrained by the fact that AI chatbots "learned" mostly from English-language sources.

Because the tools are evolving quickly, their capabilities will continue to grow. The ethical guidelines of universities and journals also need to evolve along with the application of generative AI. Some journals have already issued guidance to their authors. For example, in response to authors listing ChatGPT as a contributing author, Nature has developed guiding principles to help authors use and attribute generative AI text: AI does not qualify for authorship, but the use of the technology should be documented in the methods section (Dwivedi et al., 2023).

Conclusions

AI tools and applications still come with significant flaws when used in research. Biases, outdated training data, and a lack of transparency and credibility are major concerns. The generated text (data) must be reviewed and edited to avoid plagiarism, fabrication, and falsification. It is now critical that the research community be aware of how and where these technologies can be used, and of the main risks of bias or inaccuracy. AI is not responsible for the content of the research. If authors use AI technologies in the writing process, this should be only to improve the readability and language of the work, not to replace key tasks of the researcher, such as forming scientific opinions, analyzing and interpreting data, or drawing scientific conclusions. The application of AI technologies must remain under human supervision and control, and authors must carefully review and edit the output. It is also critical to identify and implement policies that protect against the misuse and abuse of generative AI.

But since we believe that the use of AI is inevitable, we can neither ignore it nor forbid it. As artificial intelligence (AI) advances, we will soon see even more sophisticated tools made specifically for academia, and scholars will need to know how to use them wisely for particular research practices, research process optimization, and dissemination of research results. AI can carry librarianship and academia into exciting and demanding new territory. However, it is imperative to discuss the ethical and responsible application of this technology. Instead of misusing it or allowing it to be misused, academics and librarians should work together to use this technology to improve their work, in the hope of producing new scholarly knowledge and training future experts.

REFERENCES

Alto, V. (2023). Modern generative AI with ChatGPT and OpenAI models: leverage the capabilities of OpenAI's LLM for productivity and innovation with GPT3. Birmingham, UK: Packt Publishing. Retrieved October 12, 2023 from https://www.oreilly.com/library/view/-/9781805123330/ (in English)

Currie, G. (2023). Academic integrity and artificial intelligence: is ChatGPT hype, hero or heresy? Seminars in Nuclear Medicine, 53(5), 719-730. doi: https://doi.org/10.1053/j.semnuclmed.2023.04.008 (in English)

Dashti, M., Londono, J., Ghasemi, S., & Moghaddasi, N. (2023). How much can we rely on artificial intelligence chatbots such as the ChatGPT software program to assist with scientific writing? The Journal of Prosthetic Dentistry. doi: https://doi.org/10.1016/j.prosdent.2023.05.023 (in English)

Dwivedi, Y. K., Hughes, L., Ismagilova, E., Aarts, G., Coombs, C., Crick, T. ... Williams, M. D. (2021). Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. International Journal of Information Management, 57, 101994. doi: https://doi.org/10.1016/j.ijinfomgt.2019.08.002 (in English)

Dwivedi, Y. K., Kshetri, N., Hughes, L., Slade, E. L., Jeyaraj, A., Kar, A. K. … Wright, R. (2023). “So what if ChatGPT wrote it?”: Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management, 71, 102642. doi: https://doi.org/10.1016/j.ijinfomgt.2023.102642 (in English)

Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30, 681-694. doi: https://doi.org/10.1007/s11023-020-09548-1 (in English)

Gasparini, A., & Kautonen, H. (2022). Understanding artificial intelligence in research libraries: An extensive literature review. LIBER Quarterly: The Journal of European Research Libraries, 32(1), 1-36. doi: https://doi.org/10.53377/lq.10934 (in English)

Grace, K., Salvatier, J., Dafoe, A., Zhang, B., & Evans, O. (2018). When will AI exceed human performance? Evidence from AI experts. Journal of Artificial Intelligence Research, 62, 729-754. doi: https://doi.org/10.1613/jair.1.11222 (in English)

Haenlein, M., & Kaplan, A. (2019). A brief history of artificial intelligence: On the past, present, and future of artificial intelligence. California Management Review, 61(4), 5-14. doi: https://doi.org/10.1177/0008125619864925 (in English)

Hervieux, S., & Wheatley, A. (Eds.). (2022). The rise of AI: implications and applications of artificial intelligence in academic libraries. Chicago, IL: Association of College and Research Libraries. (in English)

Hill-Yardin, E. L., Hutchinson, M. R., Laycock, R., & Spencer, S. J. (2023). A Chat (GPT) about the future of scientific publishing. Brain, Behavior, and Immunity, 110, 152-154. doi: https://doi.org/10.1016/j.bbi.2023.02.022 (in English)

IFLA. (2020). IFLA Statement on libraries and artificial intelligence. Retrieved from https://repository.ifla.org/handle/123456789/1646 (in English)

Leslie, D. (2023). Does the sun rise for ChatGPT? Scientific discovery in the age of generative AI. AI and Ethics. doi: https://doi.org/10.1007/s43681-023-00315-3 (in English)

Lund, B. D., & Wang, T. (2023). Chatting about ChatGPT: How may AI and GPT impact academia and libraries? Library Hi Tech News, 40(3), 26-29. doi: https://doi.org/10.2139/ssrn.4333415 (in English)

Manohar, N., & Prasad, S. S. (2023). Use of ChatGPT in academic publishing: a rare case of seronegative systemic lupus erythematosus in a patient with HIV infection. Cureus, 15(2). doi: https://doi.org/10.7759/cureus.34616 (in English)

Russell, S. J., Norvig, P., Popineau, F., Miclet, L., & Cadet, C. (2021). Intelligence artificielle : une approche moderne (4e éd.). Pearson France. (in French)

van Dis, E. A., Bollen, J., Zuidema, W., van Rooij, R., & Bockting, C. L. (2023). ChatGPT: five priorities for research. Nature, 614, 224-226. doi: https://doi.org/10.1038/d41586-023-00288-7 (in English)

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N. ... Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30. doi: https://doi.org/10.48550/arXiv.1706.03762 (in English)

Williams, K. C., & Lowendahl, J.-M. (2018). 5 Best Practices for Artificial Intelligence in Higher Education. Gartner. Retrieved October 11, 2023 from https://www.gartner.com/en/documents/3895923 (in English)

Yan, Y., Li, B., Feng, J., Du, Y., Lu, Z., Huang, M., & Li, Y. (2023). Research on the impact of trends related to ChatGPT. Procedia Computer Science, 221, 1284-1291. doi: https://doi.org/10.1016/j.procs.2023.08.117 (in English)

YAROSHENKO T. O.

National University of Kyiv Mohyla Academy (Kyiv, Ukraine),

e-mail: yaroshenko@ukma.edu.ua, ORCID 0000-0002-2985-2333

IAROSHENKO O. I.

National University of Kyiv Mohyla Academy (Kyiv, Ukraine),

e-mail: yaroshenkooi@ukma.edu.ua, ORCID 0000-0002-4716-5705

Artificial Intelligence (AI) for the Research Lifecycle: Challenges and Opportunities

Objective. The article aims to give a brief overview of AI technologies with regard to their impact on research and scholarly communication. Methods. AI tools and services for the research lifecycle were identified from various sources, mostly from the most influential scholarly publications of the last few years. Results. AI has become a driving force, creating both opportunities and challenges or problems. Transformative AI tools such as ChatGPT, Llama-2, Google Bard, Microsoft Bing, Jasper Chat, and others are becoming increasingly popular, extending their influence even to the sphere of scientific research, in particular to the preparation of articles and other types of scholarly manuscripts. Despite the numerous debates around AI, it is evident that the integration of these tools has become widespread in the academic environment, affecting both the practice of conducting research and the publication of results. At the same time, the enthusiastic reception of AI's capabilities is combined with concern about possible errors or data fabrication, which can seriously jeopardize ethical standards and academic integrity. There is an urgent need to understand the corresponding opportunities, challenges, and dangers. Some aspects of the use of AI tools at different stages of the research lifecycle are considered, and the main advantages and risks are analyzed. Conclusions. AI has the potential to stimulate innovation and progress in many fields, including scientific research, and therefore requires ethical and responsible use.

Keywords: artificial intelligence; AI; large language model; ChatGPT; Llama-2; academic libraries; scholarly communication; scholarly publishing; research lifecycle

Received: 14.08.2023

Accepted: 15.11.2023

https://doi.org/10.15802/unilib/2023_294639