Rise of Large Language Models
Applications based on large language models are present across all media. In this lead article on the FHV cross-sectional topic of artificial intelligence, Steffen Finck highlights success factors and outlines directions for FHV research.
Pioneering research at the FHV
GPT-4, LLaMA and BLOOM: headlines about so-called Large Language Models (LLMs) currently dominate the broad public media as well as the relevant professional media. LLMs are used as dialogue systems and hold transformative potential for many areas. In addition, they serve as the underlying technology in other applications such as machine translation and automatic summarization.
The Business Informatics research centre at the FHV has been conducting successful research in this area for some time. In the field of Failure Mode and Effects Analysis for products (FMEA for short), the BERT (Bidirectional Encoder Representations from Transformers) model was used to analyze consistency in expert assessments: experts from different departments discuss the potential risks arising from faulty product use or product defects. From a quality assurance perspective, this raises the question of the long-term consistency of FMEA assessments.
At the Business Informatics research centre, we investigated this question. We used BERT to identify representative keywords within the different FMEA dimensions and developed a research methodology for assessing consistency along these dimensions. Validation of the results shows that the methodology groups similar products and defect types in a meaningful way, which provides the basis for a consistency assessment; see this article for more details.
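To illustrate the basic idea, here is a minimal sketch, not the research centre's actual pipeline: FMEA failure descriptions are embedded with a pre-trained BERT model, and similar defect types are grouped by clustering the embeddings. The model name and the example texts are illustrative assumptions.

```python
# Sketch: embed failure descriptions with BERT, then cluster similar defect types.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

descriptions = [
    "Housing cracks under thermal stress",          # illustrative example texts
    "Seal leaks at high operating pressure",
    "Connector corrodes in humid environments",
    "Gasket fails when pressure exceeds rating",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
model = AutoModel.from_pretrained("bert-base-uncased")

with torch.no_grad():
    batch = tokenizer(descriptions, padding=True, truncation=True, return_tensors="pt")
    # Mean-pool the token embeddings of the last hidden layer into one vector per text.
    hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Group the descriptions; texts about similar defects should land in the same cluster.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(embeddings.numpy())
for text, label in zip(descriptions, labels):
    print(label, text)
```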
Allocation of product data
A second project of our research centre deals with the analysis of heterogeneous product data and its assignment to a product database. An optimized variant of the BERT model performs Named-Entity Recognition (NER) on the unstructured product data. NER identifies persons, objects, companies and other proper names in a text. A specially developed assignment algorithm subsequently maps the extracted product information to the elements of the product database. This application significantly reduces the manual effort. A prototype was realized in the context of an FFG innovation check; improvements are currently being investigated as part of a bachelor's thesis.
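The NER step can be sketched with the Hugging Face token-classification pipeline; the checkpoint and the sample product text below are illustrative assumptions, not the project's fine-tuned model or data.

```python
# Sketch: extract named entities from an unstructured product description.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",    # assumption: any BERT-based NER checkpoint
    aggregation_strategy="simple",  # merge word pieces into whole entities
)

text = "The XC-200 pump was supplied by Acme GmbH to the Dornbirn plant in 2022."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 2))

# A downstream assignment step would then map the recognized entities
# (manufacturer, location, product name) onto fields of the product database.
```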
Rapid development
Since the start of the millennium, research on artificial intelligence (AI) has developed rapidly. In particular, machine learning (ML), a subfield of AI, has been a strong driver of innovation. In ML, models for prediction and pattern recognition are learned from data, using considerable computing capacity. In the field of computer vision, Convolutional Neural Networks (CNNs) have achieved results that rival human performance.
Game learning
Next, AI learned to play. Games are popular environments for testing AI methods: the AI learned to play against itself, "automating play," so to speak. This generates additional training data, and reinforcement learning (RL) is used to continuously improve playing performance. As drivers of innovation, such models are now used to predict highly complex protein structures or to control robots and industrial equipment in complex environments.
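As a toy illustration of the reinforcement-learning idea, and not of the systems mentioned above, the following sketch lets an agent improve by repeatedly "playing" a trivial corridor game and updating tabular Q-values from the rewards it collects.

```python
# Sketch: tabular Q-learning on a toy game (walk right along a corridor to a goal).
import random

N_STATES, GOAL = 5, 4          # corridor positions 0..4, goal at position 4
ACTIONS = [-1, +1]             # step left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(500):     # "automating play": many self-generated episodes
    state = 0
    while state != GOAL:
        # Epsilon-greedy: mostly exploit current value estimates, sometimes explore.
        action = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: move the estimate toward reward + discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# The learned policy should prefer stepping right (+1) in every non-goal state.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```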
Communicating with generative AI
What was missing until recently was the ability to "communicate" with an AI in natural language. This concerns not only chatbots, but also the formulation of work instructions or control commands. The processing of language and text has been a sought-after field of application since the early days of AI research. A breakthrough here was the development of the attention mechanism and the Transformer models built on it. Transformers learn meaning within sequences by learning the relationships between the elements of those sequences. Data preparation is much less involved than in conventional image processing, where images or objects in images must be manually labeled. An appropriately pre-trained AI, for example on the Wikipedia dataset (about 21 GB in size), is then able to generate text. Such models are called Large Language Models or Foundation Models. Due to their high complexity (LLMs often have more than 10⁸ parameters), they can be used, for example, for translations, summaries and chatbots.
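The core of the attention mechanism can be shown in a few lines. The sketch below implements scaled dot-product self-attention with NumPy: every element of a toy sequence is related to every other element via query/key similarity, and the values are mixed accordingly. Shapes and inputs are illustrative only.

```python
# Sketch: scaled dot-product self-attention, the building block of Transformers.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, model_dimension)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # pairwise relationships between elements
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                             # each output mixes all values

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))                       # a toy "sequence" of 6 elements
out = scaled_dot_product_attention(x, x, x)        # self-attention: Q = K = V = x
print(out.shape)                                   # (6, 16)
```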
Even generating images from natural language is possible: the image below, for example, was generated from the prompt "Create an image of a technical college in an Alpine region." Even if the image is not perfect, it qualitatively fulfills the task.
Innovative and forward-looking
This is where the FHV's applied research on LLMs and generative AI comes in. Processing information from different sources enables new, high-quality predictions; for example, the monitoring of business processes can be supported. By adapting existing processes, new and innovative products and service offerings can be realized. Products will be able to communicate with each other in a comprehensible way and achieve a greater degree of self-control. In addition, we can use this technology to (partially) automate time-consuming manual processes or to provide additional information.
More information
Further reading:
- Article on "NLP for Product Safety Risk Assessment by Hellwig, Finck et al.
- Article on BERT
- Article on Attention
Research funding for the above projects:
- Innovation check with deductible | FFG
- Austrian Center for Digital Production COMET
- Competence Centers for Excellent Technologies | FFG
About the author:
Steffen Finck is a lecturer and researcher at the FHV. He studied aerospace engineering at the University of Stuttgart and mechanical engineering at the Rose-Hulman Institute of Technology. In 2011, he received his PhD on a topic in theoretical computer science. He is interested in optimization, data science and artificial intelligence, and their potential applications in various fields. Currently, Steffen is investigating how these technologies can be meaningfully applied in the transition to a circular economy.
If you are interested, you are welcome to contact the author.
Contact
April 2023