AWS and Atos showed how gamification speeds up AI training for hundreds of employees

Source: originally published on the AWS Machine Learning Blog. Hamidun News processes and adapts the material with AI.

Published: May 2, 2026. Reading time: 4 min.

Hamidun News Editorial

AWS showcased an Atos case study where corporate AI training was transformed into a competitive league instead of another batch of courses and certificates. Within two weeks, 409 participants created over 4,100 fine-tuned models and gained hands-on experience that could be immediately applied to client projects.

How the league was structured

Atos already had a strong foundation: more than 5,800 AWS certifications and 11 Golden Jackets within the company. That was still not enough to reach its goal of making the entire staff AI-literate by 2026. The problem is familiar to many large organizations: employees complete training but do not always go on to deploy models confidently in real work. That is why Atos and AWS chose the AWS AI League format: instead of compliance-driven lectures, a series of practical tasks with rankings, deadlines, and a finale staged as a live show. The program moved through five steps:

Introductory workshop on fine-tuning in SageMaker JumpStart
Selecting Meta Llama 3.2 3B Instruct as the base model
Preparing a JSONL dataset for an insurance scenario
Fine-tuning, deployment, and answer verification in SageMaker
Scoring on the leaderboard and selection for the live final

After the opening workshop, a two-week virtual league began. Participants repeatedly adjusted datasets, learning rates, epochs, batch sizes, and LoRA parameters to climb the results table. In the online round, models were evaluated by an automated LLM-as-a-Judge based on Llama 3.2 90B. The top five finalists then advanced to the live final, where the overall score consisted of three components: 40% from the LLM judge, 40% from five Atos experts, and 20% from audience voting. Finalists had just 90 seconds per task to adjust the system prompt and inference parameters.
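The final-round weighting can be written out as a short calculation. This is a minimal sketch: only the 40/40/20 split comes from the article, while the function name, the shared 0 to 100 scale, and the sample scores are assumptions made for illustration.

```python
# Weights from the article: 40% LLM judge, 40% expert panel, 20% audience vote.
WEIGHTS = {"llm_judge": 0.40, "experts": 0.40, "audience": 0.20}


def final_score(scores: dict) -> float:
    """Combine component scores, assumed to share one scale (e.g. 0-100)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical finalist: strong with the judge, weaker with the audience.
example = {"llm_judge": 90.0, "experts": 80.0, "audience": 50.0}
print(round(final_score(example), 2))
```

With this split, the automated judge and the human experts carry equal weight, so a model tuned only to please the LLM judge cannot win on that component alone.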

Real insurance use case

For the practical task, Atos chose an assistant for insurance underwriting, the Intelligent Insurance Underwriter, rather than an abstract demo scenario. The model needed to parse complex insurance situations, assess risks, suggest policy terms and deductibles, recommend premium adjustments, and explain its reasoning. The case shows the value of fine-tuning well: general language proficiency is not enough when a model has to handle industry-specific terminology, exceptions, and decision rules reliably. The goal is applied accuracy within the domain, not just text generation.

Technically, participants worked in Amazon SageMaker Studio and SageMaker JumpStart, where the infrastructure was largely abstracted away. For training, they assembled JSONL datasets from instruction/response pairs, uploaded them to Amazon S3, and ran fine-tuning without having to go deep into ML operations. AWS specifically notes that dataset size alone did not guarantee better results. The participants who succeeded cleaned their data, increased the diversity of examples, and tested hyperparameters systematically instead of simply generating as many records as possible. The league also provided separate tools for generating and improving datasets.
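A dataset of this kind can be prepared with a few lines of Python. The sketch below is illustrative only: the key names `instruction` and `response`, the sample texts, and the file name `train.jsonl` are assumptions, and the exact schema expected by SageMaker JumpStart should be checked against its documentation. The upload to Amazon S3 is left out.

```python
import json
from pathlib import Path

# Hypothetical underwriting examples; texts and key names are placeholders.
raw_pairs = [
    {"instruction": "Assess flood risk for a ground-floor shop near a river.",
     "response": "High flood exposure; suggest a higher deductible and explain why."},
    {"instruction": "  Assess flood risk for a ground-floor shop near a river. ",
     "response": "High flood exposure; suggest a higher deductible and explain why."},
    {"instruction": "Should a 19-year-old driver get a premium adjustment?",
     "response": ""},
]


def clean_pairs(pairs):
    """Trim whitespace, drop incomplete pairs, and remove exact duplicates."""
    seen = set()
    cleaned = []
    for pair in pairs:
        instruction = pair.get("instruction", "").strip()
        response = pair.get("response", "").strip()
        if not instruction or not response:
            continue  # incomplete example
        key = (instruction.lower(), response.lower())
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        cleaned.append({"instruction": instruction, "response": response})
    return cleaned


def write_jsonl(pairs, path):
    """Write one JSON object per line, which is what JSONL means."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for pair in pairs:
            handle.write(json.dumps(pair, ensure_ascii=False) + "\n")


cleaned = clean_pairs(raw_pairs)
write_jsonl(cleaned, "train.jsonl")
print(f"kept {len(cleaned)} of {len(raw_pairs)} examples")
```

Filtering before training reflects the point AWS makes: a smaller set of clean, distinct examples is a better starting point than a large set padded with duplicates and empty answers.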

A separate lesson emerged from the overfitting problem. Some models performed well on familiar examples but began repeating themselves or giving irrelevant answers to new questions. This was especially evident when models were tested on the leaderboard's 87 unseen questions. Participants therefore had to learn not just to run fine-tuning but to monitor eval loss, perplexity, and model behavior during inference in order to distinguish real improvements from cosmetic metric gains. For corporate training, this is an important point: people learned the logic of working with models and judging result quality, not just which buttons to press.
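The relationship between eval loss and perplexity, and a simple overfitting check, can be sketched as follows. Perplexity is the exponential of the mean per-token cross-entropy loss; the `patience` rule and the loss values are hypothetical and are not taken from the AWS article.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy)


def first_overfit_epoch(train_loss, eval_loss, patience=2):
    """Return the 1-based epoch at which eval loss starts rising for
    `patience` consecutive epochs while train loss keeps falling, else None."""
    streak = 0
    for i in range(1, len(eval_loss)):
        eval_up = eval_loss[i] > eval_loss[i - 1]
        train_down = train_loss[i] < train_loss[i - 1]
        streak = streak + 1 if (eval_up and train_down) else 0
        if streak >= patience:
            return i - patience + 2  # first epoch of the rising streak
    return None


# Made-up loss curves: training keeps improving, evaluation turns around.
train = [2.0, 1.5, 1.1, 0.8, 0.5]
evals = [2.1, 1.7, 1.6, 1.8, 2.0]

print(first_overfit_epoch(train, evals))
print([round(perplexity(loss), 2) for loss in evals])
```

A falling training loss combined with a rising eval loss is the classic sign that a model is memorizing its dataset, which matches the repeated or irrelevant answers described above.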

Why it worked

The main effect came not from the workshop itself but from the competitive mechanics around it. After the league launched, participants shared findings in work channels and attended office hours while trying not to reveal their full strategies to competitors. As a result, Atos achieved the highest engagement level among its gamified programs: 409 people on the leaderboard and over 4,100 fine-tuned models created. To reach the top five, a model needed at least a 93% win rate against responses from a much larger model. This turned training from a formal activity into an engineering task with clear goals and visible progress.
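To make the 93% bar concrete, the sketch below computes a win rate from pairwise judge verdicts. It assumes the win rate is the share of questions on which the judge preferred the fine-tuned model's answer, and that it is measured over the 87 unseen questions mentioned earlier; the article does not spell out either detail, and the verdicts are made up.

```python
import math


def win_rate(verdicts):
    """Fraction of comparisons won by the fine-tuned model.

    `verdicts` holds one boolean per question: True when the judge preferred
    the fine-tuned model's answer over the larger model's answer.
    """
    if not verdicts:
        raise ValueError("no verdicts to score")
    return sum(verdicts) / len(verdicts)


def wins_needed(threshold, num_questions):
    """Smallest number of wins that reaches the threshold."""
    return math.ceil(threshold * num_questions)


# 87 questions and the 93% bar come from the article; the verdicts do not.
verdicts = [True] * 81 + [False] * 6
print(wins_needed(0.93, 87))
print(f"{win_rate(verdicts):.1%}")
```

Under these assumptions a model could lose at most six of the 87 comparisons and still clear the threshold.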

The business takeaways for Atos were practical as well. According to AWS, a fine-tuned 3-billion-parameter model was able to outperform a 90-billion-parameter model in a narrow domain when it had relevant data and proper tuning. For companies, this is an important signal: agentic systems do not always need the largest general-purpose LLM. Small specialized models are cheaper to serve, respond faster, and scale more easily. The AWS article illustrates the infrastructure gap by contrasting an ml.g5.4xlarge instance for the small model with an ml.g5.48xlarge for the larger one. After the program, 85% of participants reported feeling more confident discussing generative AI with clients, and the entire training cycle took two weeks instead of the months that traditional preparation would require.

What it means

The Atos case shows that corporate AI training is shifting from passive courses toward short practical cycles with measurable results. For companies that want to bring employees to actual implementations, not just train them in AI, a format with an industry case study, a leaderboard, and continuous iteration looks notably more effective than standard theory and one-off certifications. This is especially true where the business needs specialists who can quickly assemble a working domain-specific assistant, not general GenAI knowledge.
