Mouse Picking in WebGL: Color Picking vs Ray Casting

System internals

For those interested in system internals, the following describes the object selection techniques implemented in the current version.

Mouse picking is the first problem you face when you want to build interaction with 3D content in your application. Fortunately, since 3D graphics as a modern technology is a well-developed field, there are several well-known approaches.

The two primary techniques for mouse picking in WebGL are color-based picking and ray casting. The "right choice" depends on your scene complexity, performance, and whether you need precision or just approximate selection. Many developers make this choice early, and prefer one approach over the other in the long term.

But the “right choice” is often only a temporary decision that is valid under a narrow set of tasks. Real-world applications often combine techniques, and the only practical advice is this: use a mixed approach suitable for current needs. So, that's the concept behind these techniques. You can read more below.

In the WebGL color picking approach, as you know, your device renders pixels and draws all these shapes. When you click or tap on your screen, the client software can obtain the exact coordinates of the event and the color of the pixel you are pointing at. So, I render the scene twice. The first time, each object is drawn with a unique solid color. The second time, it is rendered as you would normally expect to see it. Meanwhile, you notice nothing at all. Then I read the pixel under the mouse and compare it with the unique color assigned to each object. This is how I detect which object in the scene has been clicked. It is computationally low-cost and precise (pixel-perfect).

In the current version, the ray casting method is primarily used to compute ray - mesh triangle intersections. Here is how this approach works.

In the WebGL ray casting technique, you start with the mouse position in screen coordinates, perform the necessary matrix transformations, and obtain the camera position and the direction of the ray in 3D world space. Then, you need access to the data of all scene objects. Since you have access to the vertex positions of your models, you can test intersections on the CPU using the Möller-Trumbore algorithm for each triangle by simply looping over all triangles.

These two main techniques are widely used. You can optimize them by shifting some computations to the GPU using shaders. Regarding the current version of the software, it is keeping the shaders simple for now.

Mouse Picking in WebGL: Color Picking vs. Ray Casting

System internals