Distributions are a ubiquitous data type in high energy physics (HEP), in many other sciences, and in daily life. When the space of distributions is equipped with a suitable metric, previously ad hoc notions of similarity can be formulated precisely, opening up many new applications with profound theoretical implications. Optimal transport (OT) is the mathematical theory that provides such well-defined distances between distributions. In this talk, I will introduce the theory of optimal transport, with particular emphasis on how to linearize two special OT distances. I will then discuss recent applications of OT in collider physics and dark matter astrophysics to showcase the power of this geometric framework. As the adoption of optimal transport in HEP is still in its infancy, I invite everyone to think of other potential use cases in their own research.