Understanding LLM Hallucinations
An exploration of why language models generate false information and what we can do about it.
✦ Exploring Life, RL, model alignment, and diffusion models ✦
I'm a researcher exploring the intersection of AI safety and capability. My work focuses on understanding and mitigating LLM hallucinations, improving model alignment, and advancing diffusion model techniques.
This blog documents my research, projects, and daily observations from the frontiers of AI development. Join me on this journey through the complexities of artificial intelligence.
A deep dive into how diffusion models work and their applications in AI generation.
The technical and philosophical challenges in aligning AI systems with human values.
Building a system to detect and flag potential hallucinations in LLM outputs.
Visualizing alignment metrics across different model architectures.
Developing RL agents that adapt to dynamic market conditions to optimize trading strategies.
The gap between capability and alignment is growing faster than we anticipated.
🕐 Nov 9: Interesting paper on Constitutional AI - a game changer for scalable oversight?
🕐 Nov 8: Spent the day debugging a diffusion model. The loss curves never lie.