
Available Models

Information

This page contains information on the models available in the LLM Hosting service.

We provide a collection of open-weight large language models (LLMs) that are self-hosted on our HPC infrastructure, giving you the flexibility to choose the models that best fit your workflows and needs. User prompts and requests are processed in real time only; the content of these prompts is not saved, logged, or stored at any point. Your data is therefore handled according to high protection standards to ensure confidentiality and security.
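
The hosted models are typically reached through a chat-style API. The sketch below is a minimal example assuming the service exposes an OpenAI-compatible endpoint; the base URL, access token, and model name are placeholders, not the actual service configuration, so please consult the service access documentation for the real values.

    # Minimal sketch of querying a hosted model, assuming an OpenAI-compatible
    # chat endpoint. Base URL and token below are placeholders (assumptions).
    from openai import OpenAI

    client = OpenAI(
        base_url="https://llm.example.org/v1",  # placeholder endpoint (assumption)
        api_key="YOUR_ACCESS_TOKEN",            # placeholder credential (assumption)
    )

    response = client.chat.completions.create(
        model="Mistral-Small-3.2-24B",  # any model name from the table below
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize the advantages of sparse Mixture-of-Experts models."},
        ],
        max_tokens=256,
    )
    print(response.choices[0].message.content)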


Model updates

Our model list will be updated at regular intervals, taking into account our available hardware resources as well as observed demand, usage patterns, and utilization.

Current model list

Model                 | Provider   | Release Date | Max. Context Length | Capabilities                                                                                              | Limitations and Comments
----------------------|------------|--------------|---------------------|-----------------------------------------------------------------------------------------------------------|---------------------------------------------
Mixtral-8x22B         | Mistral AI | 2024-04-17   | 64K                 | Excels in reasoning, mathematics, coding, and multilingual benchmarks                                       | Large, but somewhat older
Mistral-Small-3.2-24B | Mistral AI | 2025-06-25   | 128K                | Compact 24B model optimized for low-latency inference; good overall performance and quality                 |
Apertus-70B           | Swiss-AI   | 2025-09-02   | 64K                 | Strong performance among open models on multilingual/reasoning benchmarks; medium to good overall quality   |
gpt-oss-120B          | OpenAI     | 2025-08-06   | 128K                | Strong reasoning, coding, and benchmark performance; very good overall performance and quality              | Weaker in multilingual or niche domain areas

Details: Mixtral-8x22B

Mixtral 8×22B is an open-source large language model (LLM) developed by Mistral AI, released in April 2024 under the permissive Apache 2.0 license. It employs a Sparse Mixture-of-Experts (SMoE) architecture:

  • The model spans 141 billion total parameters, but activates only about 39 billion parameters per token, thanks to the MoE design
  • It consists of 8 experts of 22B parameters each, with 2 experts activated per token (illustrated in the sketch below)
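
As a toy illustration of why the active parameter count is so much smaller than the total, the following snippet works through the published figures; it is purely illustrative and does not reproduce the actual split between shared and expert weights.

    # Toy illustration (not model code): why top-2-of-8 routing keeps the
    # per-token cost far below the 141B total. Published figures only; the
    # shared/expert parameter split is not modeled here.
    total_params = 141e9    # total parameters (published)
    active_params = 39e9    # parameters used per token (published)
    experts, active_experts = 8, 2

    print(f"experts used per token: {active_experts}/{experts}")
    print(f"active fraction of the network: {active_params / total_params:.0%}")  # ~28%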

Key Capabilities & Strengths:

  • Efficiency & Cost-effectiveness: The SMoE setup allows Mixtral 8x22B to be faster and more cost-efficient than many dense models of similar or larger size (e.g., LLaMA 2 70B)
  • Large Context Window: It supports a context window of 64,000 tokens, enabling it to process and recall large documents with precision
  • Multilingual Proficiency: Mixtral 8x22B is capable in several languages, namely English, French, German, Italian, and Spanish, and performs strongly on multilingual benchmarks compared to other open models
  • Strong Performance in Math & Coding: It excels in reasoning-intensive tasks, coding, and mathematics. On benchmarks such as GSM8K and HumanEval, Mixtral 8×22B achieves top-tier scores, e.g., ~90.8% on GSM8K (maj@8)

Details: Mistral-Small-3.2-24B

Mistral-Small-3.2-24B belongs to the "Small" series from Mistral AI and is an enhanced iteration of the earlier Mistral Small 3.1 and Small 3 releases. All 24B-parameter models in the series are released under the Apache 2.0 license. It was officially introduced in June 2025, with model card version 2506 representing the "Instruct" variant optimized for instruction following.

Key Capabilities & Strengths:

  • Stronger Instruction Following: Shows major improvements on tough benchmarks (Wildbench, Arena Hard), making it more reliable for guided tasks.
  • Reduced Repetition & Infinite Loops: Generates cleaner outputs with fewer cases of runaway or repetitive text.
  • Enhanced STEM & Coding Abilities: Higher accuracy on programming (HumanEval+, MBPP+) and reasoning-heavy benchmarks.
  • Vision Input Support: Can handle images as input, enabling multimodal tasks such as document parsing or visual Q&A (see the request sketch after this list).
  • Extended Context Window (128K tokens): Capable of working with very long documents, transcripts, or multi-step workflows.
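
Relating to the vision-input capability above, the following is a minimal sketch of a multimodal request, again assuming the service exposes the model through an OpenAI-compatible endpoint; the base URL, token, and file name are placeholders, and image input depends on the hosting configuration actually enabling the model's vision capability.

    # Hedged sketch: sending an image to Mistral-Small-3.2-24B via an
    # OpenAI-compatible chat endpoint. Endpoint, token, and file name are
    # placeholders (assumptions).
    import base64
    from openai import OpenAI

    client = OpenAI(base_url="https://llm.example.org/v1", api_key="YOUR_ACCESS_TOKEN")

    with open("invoice.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="Mistral-Small-3.2-24B",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the total amount from this document."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    print(response.choices[0].message.content)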

Details: Apertus-70B

Apertus-70B is a 70 billion-parameter, fully open, multilingual LLM released in 2025 by Swiss public institutions (EPFL, ETH Zurich, CSCS). It sets transparency, compliance, and multilingual reach as core design principles: open weights, open data pipelines, and respect for opt-out / PII / licensing constraints.

Key Capabilities & Strengths:

  • Broad multilingual support (1,811 languages, ~40% non-English data)
  • Long context handling (up to 65,536 tokens)
  • Reduced verbatim memorization via Goldfish objective
  • Strong performance among open models on multilingual / reasoning benchmarks
  • Full transparency and auditability (release of training code, checkpoints, data; see the loading sketch below)
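
Because the weights are openly released, they can also be loaded outside this service, for example with Hugging Face transformers. The sketch below uses an illustrative repository identifier (an assumption, not confirmed here); verify the exact model id against the official release, and note that a 70B model requires substantial GPU memory or quantization.

    # Hedged sketch: loading the openly released Apertus weights locally.
    # The repository identifier is illustrative (assumption); check the
    # official release for the exact model id.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "swiss-ai/Apertus-70B"  # illustrative identifier, verify before use

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer("Grüezi! Please answer in English:", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))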

Limitations / Considerations:

  • Still likely behind top-tier closed models in many niche or high-end tasks.

Details: gpt-oss-120B

gpt-oss-120b is OpenAI’s flagship open-weight reasoning model, released under the Apache 2.0 license in mid-2025. It uses a sparse Mixture-of-Experts Transformer architecture (117B total / ~5.1B active) with MXFP4 quantization. It achieves near-parity with OpenAI's o4-mini on core reasoning benchmarks and delivers good inference performance in terms of latency and throughput.

Key Capabilities & Strengths:

  • Strong reasoning and benchmark performance (near parity with o4-mini).
  • Large context capacity (128K tokens) for long documents and workflows.
  • Native agentic features: function-calling (note: we are working on enabling this) and structured output (see the request sketch after this list)
  • Efficiency gains via sparse activation and quantization
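
For orientation, the following sketch shows the request shape commonly used for function-calling against an OpenAI-compatible endpoint. As noted above, function-calling is not yet enabled on this service, so treat this as the expected interface once it is; the endpoint, token, and tool definition are placeholders and the tool itself is hypothetical.

    # Hedged sketch of a function-calling request for gpt-oss-120B via an
    # OpenAI-compatible endpoint. Endpoint, token, and tool are placeholders;
    # function-calling is not yet enabled on this service (see note above).
    from openai import OpenAI

    client = OpenAI(base_url="https://llm.example.org/v1", api_key="YOUR_ACCESS_TOKEN")

    tools = [{
        "type": "function",
        "function": {
            "name": "get_job_status",  # hypothetical tool, for illustration only
            "description": "Return the status of an HPC batch job.",
            "parameters": {
                "type": "object",
                "properties": {"job_id": {"type": "string"}},
                "required": ["job_id"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-oss-120B",
        messages=[{"role": "user", "content": "Is job 4711 still running?"}],
        tools=tools,
    )
    # If the model decides to call the tool, the call appears here instead of text:
    print(response.choices[0].message.tool_calls)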

Limitations / Considerations:

  • Sparse scaling does not always yield linear gains; on some benchmarks the smaller 20B variant matches or outperforms the 120B model.
  • Weaker performance in multilingual or niche domain areas.

Deprecated models

None.

Last modified on 17.10.2025


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Germany License.