PHAROS AI Factory announces the 3rd Course of its Training Series, LLMs Track: “Introduction to Large Language Models at Scale”, held online via Zoom.
Presentation language: Greek
Audience: The primary target audience consists of AI/Machine Learning Engineers, Data Scientists, and HPC Engineers working in industries that leverage supercomputing for large-scale modeling and simulation. This includes professionals in the technology, finance, and research sectors seeking to optimize performance and gain access to advanced European infrastructure. It also targets software developers aiming to build and scale next-generation LLMs, such as models for specialized language domains.
Location: Online via Zoom (you will receive the Zoom link upon registration)
Description:
- Introduction to High Performance Computing: We will analyze the fundamental components of an HPC system, including the roles of CPUs, GPUs, and nodes in parallel computing environments. Furthermore, the talk will explore how HPC accelerates Artificial Intelligence (AI), specifically through Parallel Stochastic Gradient Descent (PSGD), and conclude with an overview of the TOP500 list and the European supercomputing ecosystem.
- How to prepare an access proposal for Greek and European HPC Clusters: This talk provides an overview of the Greek and EuroHPC Joint Undertaking (JU) Access Calls and the key elements required for a successful application. We will cover how to specify resource requirements and the evaluation criteria used to assess proposals.
- Introduction to LLMs: We present LLM fundamentals, focusing on the key components that constitute an LLM architecture, including data tokenization, attention mechanisms, and other general architectural choices. We then proceed with training details and objectives, including techniques to combat overfitting.
- Building Greek Large Language Models: Meltemi & Krikri: A dynamic overview of the Greek Large Language Models Meltemi and Krikri, focusing on dataset choices, training and adaptation methods (fine-tuning), as well as evaluation approaches. This section aims to offer a comprehensive first look at the technological foundations behind Greek LLMs, paving the way for their future advancement and setting the stage for the topics that will be presented in Pharos AI’s upcoming training programs.
- Fine-Tune LLMs on a Single GPU (hands-on): This talk explores the practical aspects of fine-tuning Large Language Models (LLMs) using the high-level Hugging Face Trainer API. We detail the process of adapting a Llama 3.2-1B-Instruct model to a specialized task using Hugging Face datasets. The core focus is demonstrating efficient training loop management and analyzing the performance gains of the fine-tuned model over the base model. A minimal illustrative sketch of such a workflow appears after this session list.
- Scalability of LLMs: Following the introduction to LLMs lecture, we proceed with LLM inference details, including the sampling strategies used in practice and how a trained LLM is ultimately put to use. We conclude by introducing the emergent abilities of LLMs, i.e. human-like abilities that LLMs implicitly acquire at scale, which justify the need for HPC resources for efficient inference.
- Fine-Tune LLMs on a node with multiple GPUs (code presentation): This talk dissects a distributed fine-tuning workflow for LLMs using the Hugging Face Trainer and torch.distributed.run. The demonstration focuses on adapting a Llama 3.2-1B-Instruct model by tokenizing a custom markdown corpus for causal language modeling. Key sections include setting up the multi-GPU environment using a Slurm job script and showcasing the performance difference between the base and fine-tuned models on domain-specific questions.
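As a taste of the hands-on material, the sketch below outlines the kind of workflow the single-GPU session covers: fine-tuning meta-llama/Llama-3.2-1B-Instruct with the Hugging Face Trainer and then generating from the result with common sampling settings. The corpus file, hyperparameters, and output paths are placeholders chosen for illustration, not the course's actual material.

```python
# Illustrative sketch only: fine-tune Llama 3.2-1B-Instruct on a small plain-text
# corpus with the Hugging Face Trainer, then sample from the fine-tuned model.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-3.2-1B-Instruct"   # gated model: requires an HF access token
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token          # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder corpus: any plain-text file works for causal language modeling.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama32-1b-finetuned",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,                                 # assumes an Ampere-or-newer GPU
        logging_steps=10,
        report_to="none",
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("llama32-1b-finetuned")

# Quick qualitative check with sampling (temperature / top-k / top-p); the same
# prompt against the base model gives a feel for the fine-tuning gains.
prompt = "Explain what a compute node is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100, do_sample=True,
                        temperature=0.7, top_k=50, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Loading the base model a second time and prompting it with the same question provides the before/after comparison highlighted in the session.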
Learning Objectives:
- Introduce the fundamental concepts of High-Performance Computing (HPC), including its hardware components and its role in accelerating AI through Parallel Stochastic Gradient Descent (PSGD); an illustrative data-parallel SGD sketch follows this list.
- Provide a comprehensive understanding of the application process for accessing Greek and European HPC resources.
- Explore the technological foundations and development of Greek Large Language Models (LLMs), specifically Meltemi and Krikri.
- Offer practical, hands-on experience in fine-tuning LLMs efficiently in single- and multi-GPU environments using tools such as the Hugging Face Trainer API and torch.distributed.run.
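For readers new to the term, the sketch below shows one common reading of parallel SGD in a data-parallel setting: identically initialised workers compute gradients on their own data shards and average them with an all-reduce before every optimizer step. It is an assumption-based illustration with synthetic data, not material from the HPC lecture.

```python
# Illustrative sketch of synchronous data-parallel SGD: every worker holds an
# identical model copy, computes gradients on its own data shard, and the
# gradients are averaged across workers before the optimizer step.
# Example launch (2 CPU workers): torchrun --nproc_per_node=2 psgd_sketch.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="gloo")          # use "nccl" on GPU nodes
    rank, world = dist.get_rank(), dist.get_world_size()

    torch.manual_seed(0)                             # identical init on every worker
    model = torch.nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    data_rng = torch.Generator().manual_seed(1234 + rank)   # different shard per worker
    for _ in range(100):
        x = torch.randn(32, 10, generator=data_rng)          # synthetic features
        y = torch.randn(32, 1, generator=data_rng)           # synthetic targets
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        for p in model.parameters():                 # average gradients across workers
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world
        opt.step()                                   # identical update on every worker

    if rank == 0:
        print(f"trained with {world} workers, last local loss {loss.item():.4f}")

if __name__ == "__main__":
    main()
```

With identical initialisation and identical averaged updates, every worker ends each step with the same parameters, which is exactly what DistributedDataParallel automates in practice.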
Learning Outcomes:
- Identify the core components (CPUs, GPUs, Nodes) of an HPC system and explain how they enable parallel computing, specifically in AI workloads (e.g., PSGD).
- Understand the structure and key elements required for successful access proposals to Greek and EuroHPC clusters.
- Describe the development pipeline for the Greek LLMs Meltemi and Krikri, including dataset choices, training, adaptation (fine-tuning), and evaluation methods.
- Apply the Hugging Face Trainer API to fine-tune a Llama 3.2-1B-Instruct model on a specialized dataset using a single GPU.
- Implement a distributed fine-tuning workflow for LLMs on a multi-GPU node, utilizing Slurm and torch.distributed.run, and evaluate the resulting performance gains; a rough launch sketch follows this list.
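To make the last two outcomes concrete, here is a rough sketch of how such a run could be launched on one multi-GPU node. The Slurm directives, GPU count, file names, and hyperparameters are illustrative assumptions, not the course's actual job script or training code.

```python
# Illustrative sketch of multi-GPU fine-tuning on a single node.
# An (assumed) Slurm job script would contain something along the lines of:
#
#   #SBATCH --nodes=1
#   #SBATCH --gres=gpu:4
#   srun torchrun --nproc_per_node=4 finetune.py
#
# torchrun (the torch.distributed.run entry point) sets RANK, LOCAL_RANK and
# WORLD_SIZE for each process; the Hugging Face Trainer reads these variables
# and wraps the model in DistributedDataParallel, so finetune.py stays almost
# identical to the single-GPU sketch shown earlier.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder markdown corpus tokenized for causal language modeling.
dataset = load_dataset("text", data_files={"train": "docs_corpus.md"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama32-1b-md-finetuned",
        per_device_train_batch_size=2,    # per GPU; effective batch scales with WORLD_SIZE
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        bf16=True,
        ddp_find_unused_parameters=False,
        report_to="none",
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("llama32-1b-md-finetuned")   # the Trainer writes only from the main process
```

The same script runs unchanged on a single GPU; distribution is decided entirely by how it is launched.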
The course’s agenda can be found here.
Register to attend by filling out the form here.

