Neural intel Pod

Episodes (Page 9)

FileFix: Browser to PowerShell Social Engineering

Jun 29, 2025 · 00:26:07

✨ FileFix uses address bar for PowerShell commands.

Reinforcement Learning Under Unmeasured Confounding

Jun 28, 2025 · 01:04:20

✨ Offline RL framework for unmeasured confounding.

Reinforcement Learning for Urban Air Quality Management

Jun 27, 2025 · 01:01:19

✨ DRL optimizes air purification booth placement.

Reinforcement Learning in Non-Stationary Environments

Jun 26, 2025 · 00:31:26

✨ NS-NAC algorithm for non-stationary environments.

Personalized Policy Learning from Heterogeneous Data

Jun 25, 2025 · 00:38:42

✨ Offline RL for personalized policies from diverse data.

Boosting Reinforcement Learning with Human Feedback via SeRA

Jun 23, 2025 · 00:34:05

✨ SeRA mitigates spurious correlations in RLHF.

AXIOM: Active Inference Object-Centric World Models

Jun 22, 2025 · 00:36:09

✨ AXIOM uses object-centric models and active inference.

Entropy and Reinforcement Learning for LLMs

Jun 21, 2025 · 00:31:10

✨ Policy entropy declines rapidly in RL for LLMs, limiting exploration.

FLEX Robot-Agnostic Force-Based Manipulation Learning

Jun 19, 2025 · 00:56:34

Agent RL Scaling for Mathematical Problem Solving

Jun 18, 2025 · 00:51:16

✨ ZeroTIR trains LLMs to use Python for math via RL.

Beyond Reward: Limits of RL in LLM Reasoning

Jun 17, 2025 · 00:39:57

✨ RLVR may not fundamentally improve LLM reasoning beyond base models.

Reward Model Variance in RLHF

Jun 15, 2025 · 00:50:58

✨ Reward model variance, not just accuracy, impacts RLHF efficiency.

Power Grid Topological Control with Graph Reinforcement Learning

Jun 14, 2025 · 00:57:47

✨ Graph RL optimizes power grid control with masked actions.

Decentralized RL for Multi-Resource Allocation via Dynamic Cluster Agreements

Jun 13, 2025 · 00:52:32

✨ LGTC-IPPO uses dynamic cluster agreements for decentralized resource allocation.

Reinforcement Learning for Humanoid Dexterous Manipulation

Jun 12, 2025 · 00:42:03

✨ RL enables vision-based dexterous manipulation on humanoid robots.

µCODE: Code Generation with Single-Step Rewards

Jun 11, 2025 · 00:50:32

✨ µCODE generates code iteratively using single-step execution rewards.

Confidence-Reward Preference Optimization for Machine Translation

Jun 10, 2025 · 00:55:38

✨ CRPO improves machine translation data selection for LLMs.

Personalized Preference Learning with MiCRo

Jun 9, 2025 · 00:47:37

✨ MiCRo framework learns diverse human preferences for LLMs.

ProRL Expands LLM Reasoning Boundaries

Jun 8, 2025 · 00:41:43

✨ ProRL enhances LLM reasoning with KL divergence control.

Open CaptchaWorld: Benchmarking MLLM Agents

Jun 7, 2025 · 00:12:43

✨ Open CaptchaWorld benchmarks multimodal AI agents.
