blogposts

Blog Posts

Steering LLMs' Behavior with Concept Activation Vectors

Concept activation vectors have been shown to take effects in safety concepts, efficiently and effectively guiding a considerable number of open-source large language models (LLMs) to respond positively to malicious instructions. In this blog, we aim to explore the capability boundaries of concept activation vectors in guiding various behaviors of LLMs through more extensive experiments. Our experiments show that this technique can transfer the text style at a low cost, but it is powerless to deal with short factual knowledge.

21 min read · May 7, 2025

2025
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Motivated by reducing the computational and storage costs of LLMs, model compression and KV cache compression have attracted much attention from researchers. However, current methods predominantly emphasize maintaining the performance of compressed LLMs, as measured by perplexity or simple accuracy on tasks of common sense knowledge QA and basic arithmetic reasoning. In this blog, we present a brief review of recent advancements in LLMs related to retrieval-augmented generation, multi-step reasoning, external tools, and computational expressivity, all of which substantially enhance LLM performance. Then, we propose a lottery LLM hypothesis suggesting that for a given LLM and task, there exists a smaller lottery LLM capable of producing the same performance as the original LLM with the assistance of multi-step reasoning and external tools. Based on the review of current progress in LLMs, we discuss and summarize the essential capabilities that the lottery LLM and KV cache compression must possess, which are currently overlooked in existing methods.

24 min read · April 28, 2025

2025
Understanding Methods for Scalable MCTS

Monte Carlo Tree Search (MCTS) is a versatile algorithm widely used for intelligent decision-making in complex, high-dimensional environments. While MCTS inherently improves with more compute, real-world applications often demand rapid decision-making under strict inference-time constraints. This blog post explores scalable parallelization strategies for MCTS, covering classical methods (leaf, root, and tree parallelism) and advanced distributed approaches—including virtual loss, transposition-driven scheduling, and distributed depth-first scheduling. By examining the practical trade-offs and performance implications of each method, we identify effective techniques for achieving high-throughput, low-latency planning—critical for applications like autonomous vehicles, emergency response systems, and real-time trading.

27 min read · April 28, 2025

2025
Understanding Model Calibration - A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)

To be considered reliable, a model must be calibrated so that its confidence in each decision closely reflects its true outcome. In this blogpost we'll take a look at the most commonly used definition for calibration and then dive into a frequently used evaluation measure for model calibration. We'll then cover some of the drawbacks of this measure and how these surfaced the need for additional notions of calibration, which require their own new evaluation measures. This post is not intended to be an in-depth dissection of all works on calibration, nor does it focus on how to calibrate models. Instead, it is meant to provide a gentle introduction to the different notions and their evaluation measures as well as to re-highlight some issues with a measure that is still widely used to evaluate calibration.

24 min read · April 28, 2025

2025
“I Am the One and Only, Your Cyber BFF”: Understanding the Impact of GenAI Requires Understanding the Impact of Anthropomorphic AI

State-of-the-art generative AI (GenAI) systems are increasingly prone to anthropomorphic behaviors, i.e., to generating outputs that are perceived to be human-like. While this has led to scholars increasingly raising concerns about possible negative impacts such anthropomorphic AI systems can give rise to, anthropomorphism in AI development, deployment, and use remains vastly overlooked, understudied, and under-specified. In this blog post, we argue that we cannot thoroughly understand the impact of generative AI without understanding the impact of anthropomorphic AI, and outline a call to action.

16 min read · April 28, 2025

2025

blogposts

Blog Posts

Steering LLMs' Behavior with Concept Activation Vectors

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Understanding Methods for Scalable MCTS

Understanding Model Calibration - A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)

“I Am the One and Only, Your Cyber BFF”: Understanding the Impact of GenAI Requires Understanding the Impact of Anthropomorphic AI