- End-to-End RL Methods for UAVs
  An overview from a broader research perspective.
  5 min read
- UAV Review
  A quick-reference notebook for the UAV domain.
  10 min read
- RL Notes (12): DDPG
  6 min read
- RL Notes (11): Imitation Learning
  4 min read
- RL Notes (10): Sparse Rewards
  3 min read
- RL Notes (9): Actor-Critic
  3 min read
- RL Notes (8): Q Methods for Continuous Actions
  Why DQN struggles with continuous actions, common workarounds, and why many switch to Actor-Critic.
  3 min read
- RL Notes (7): Advanced DQN
  3 min read
- RL Notes (6): DQN
  3 min read
- RL Notes (5): PPO
  4 min read
- RL Notes (4): Policy Gradient
  4 min read
- RL Notes (3): From MC and TD(0) to Sarsa / Q-learning
  MC vs TD(0) basics, then Sarsa vs Q-learning; on-policy vs off-policy control intuition.
  4 min read
- RL Notes (2): MDP, MRP, and the Bellman Equation
  Markov property, MRP/MDP, Bellman equation, prediction vs control, and DP (policy/value iteration).
  5 min read
- RL Notes (1): What Does RL Learn?
  Intro to RL: what it learns, I/O, exploration vs exploitation, and state vs observation.
  7 min read
- RL Practice (1): Framework + Q-Learning & Sarsa
  A minimal, runnable practice of Q-Learning and Sarsa, epsilon-greedy, an episode-based training loop, and saving/loading.
  8 min read