
What we’re about
We want to bring people together who are interested in AI and Machine Learning. At our meetups, we have:
- Networking
- Talks
- Fireside chats
- Knowledge exchange
- Applications of AI and Machine Learning
We organize our meetups every other month.
We are always looking for innovative and inspiring speakers. If you know somebody who would be an excellent fit for our meetup, we would highly appreciate your recommendation. To recommend a speaker for CAIML, please fill out this form.
Learn more about the organizers:
Upcoming events
CAIML #38, lise GmbH, Köln
CAIML #38 will take place on September 9, 2025, at lise GmbH.
We will have two talks with additional time for networking.
Talk 1: Tomaz Bratanic (Graph ML and GenAI research at Neo4j): Agentic GraphRAG with MCP servers
This talk explores design patterns for integrating graph memory into agentic workflows, discusses trade-offs between retrieval accuracy and computational efficiency, and highlights how persistent knowledge unlocks more capable, personalized, and trustworthy AI systems.
Talk 2: Pablo Iyu Guerrero (AI Inference Engineer at Aleph Alpha) and Lukas Blübaum (AI Engineer at Aleph Alpha): Tokenizer-free language model inference
Traditional Large Language Models rely heavily on large, predefined tokenizers (e.g., 128k+ vocabularies), introducing limitations in handling diverse character sets, rare words, and dynamic linguistic structures. This talk presents a different approach to language model inference that eliminates the need for conventional large-vocabulary tokenizers. The system operates with a core vocabulary of only 256 byte values, processing text at the most fundamental level. It employs a three-part architecture: byte-level encoder and decoder models handle character sequence processing, while a larger latent transformer operates on higher-level representations. The interface between these stages involves dynamically creating "patch embeddings", guided by word boundaries or entropy measures.
This talk will first introduce the intricacies of this byte-to-patch transformer architecture. Subsequently, we will focus on the significant engineering challenges encountered in building an efficient inference pipeline, specifically coordinating the three models, managing their CUDA graphs, and handling their respective KV caches.
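To give a feel for the entropy-guided patching the abstract mentions, here is a minimal, self-contained sketch: it segments a byte sequence into patches by starting a new patch wherever a toy next-byte model is uncertain. The bigram model, the threshold value, and the function name `entropy_patches` are illustrative stand-ins, not the speakers' actual implementation, which uses a learned byte-level model.

```python
import math
from collections import defaultdict

def entropy_patches(data: bytes, threshold: float = 1.5):
    """Split a byte sequence into patches, opening a new patch
    whenever next-byte entropy exceeds `threshold`.

    A toy bigram count model stands in for the small byte-level
    encoder that would normally estimate the entropy.
    """
    # "Train" bigram counts on the sequence itself (illustration only).
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(data, data[1:]):
        counts[prev][nxt] += 1

    def next_byte_entropy(prev: int) -> float:
        dist = counts[prev]
        total = sum(dist.values())
        if total == 0:
            return 8.0  # unseen context: maximally uncertain over 256 bytes
        return -sum((c / total) * math.log2(c / total) for c in dist.values())

    patches, current = [], bytearray([data[0]])
    for prev, nxt in zip(data, data[1:]):
        # High uncertainty about the next byte marks a patch boundary.
        if next_byte_entropy(prev) > threshold:
            patches.append(bytes(current))
            current = bytearray()
        current.append(nxt)
    patches.append(bytes(current))
    return patches

patches = entropy_patches(b"the cat sat on the mat")
print(patches)
```

Uncertain contexts (like the space before a new word) tend to become boundaries, which is why entropy-based patches often align loosely with word boundaries, the other segmentation signal the abstract mentions.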
⚠️ Please note: ⚠️ All attendees are additionally required to register here. lise GmbH requires every attendee to be registered in order to participate in the event.
We will share an agenda soon. See you in September 🤖