Dive site fauna analytics Part 1
- Aleksandr Kolmakov
- Aug 7
Coral reefs are among the most beautiful and diverse ecosystems in the world. They host so many different living organisms that it can be really hard to keep track or even remember everything that was seen during just one dive.

While diving regularly in the UAE in 2024, I witnessed an infestation: a crown-of-thorns starfish (COTS for short) outbreak.
Meet the Acanthasteridae family:

These starfish are not just highly venomous; they love to prey on coral.
On a healthy, balanced reef there may be one or two crown-of-thorns starfish per hectare. At that density their feeding keeps pace with the reef's growth, and because they favor faster-growing corals, they promote reef diversity.
Problems start when COTS eat coral faster than the reef can grow back. These starfish release millions of eggs, which can lead to outbreaks. After six months, juvenile COTS switch from an algae diet to consuming corals. Dwindling numbers of their natural predators, such as the humphead wrasse and the triton snail, make the situation worse. A COTS outbreak can devastate large areas of coral reef, as evidenced by an eight-year outbreak on the Great Barrier Reef in the 1970s, which destroyed or damaged hundreds of reefs.
By this point you may be asking: how can data engineering help with any of this?
The outbreak gave me an idea: build a tool that not only provides access to biodiversity data on divesites, but hopefully also answers every diver's eternal question: what are my chances of seeing a whale shark?
Let's write a reusable data pipeline to dive deep into scientific occurrence data and shine some light on how the fauna of local divesites is structured. This will let us observe species distributions and see how changes in them can signal potential problems.
A variety of open sources are available to us, such as:
Let's formulate a high-level plan that somehow consists only of acronyms:

My plan
Obtain data from occurrence databases
Filter it through WoRMS taxa to leave only marine species
Mark species that are invasive or redlisted via IUCN lists
Keep only species that were spotted close to a divesite
Somehow visualize what we found (unclear for now)
Profit!
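
To make the middle of the plan more concrete, here is a minimal sketch in Python of the filtering steps: keep marine species, mark flagged ones, and keep only records near a divesite. It is not the final pipeline. Everything in it is an assumption made for illustration: the `Occurrence` record, the `run_pipeline` and `haversine_km` names, the coordinates, the 5 km radius, and the hard-coded species sets that stand in for real WoRMS and IUCN lookups.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Occurrence:
    """One sighting record (hypothetical shape, not a real database schema)."""
    species: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def run_pipeline(occurrences, marine_species, flagged_species, site, radius_km):
    """Filter already-downloaded occurrences down to one divesite.

    marine_species stands in for a WoRMS lookup and flagged_species
    for IUCN-based lists; both are plain Python collections here.
    """
    site_lat, site_lon = site
    results = []
    for occ in occurrences:
        # Keep only marine species.
        if occ.species not in marine_species:
            continue
        # Keep only sightings close to the divesite.
        distance = haversine_km(occ.lat, occ.lon, site_lat, site_lon)
        if distance > radius_km:
            continue
        # Mark invasive or redlisted species; everything else gets "none".
        results.append({
            "species": occ.species,
            "status": flagged_species.get(occ.species, "none"),
            "distance_km": round(distance, 2),
        })
    return results


# Made-up sample data, for illustration only.
SAMPLE_OCCURRENCES = [
    Occurrence("Acanthaster planci", 25.01, 56.41),  # near the site
    Occurrence("Rhincodon typus", 25.02, 56.39),     # near the site
    Occurrence("Passer domesticus", 25.00, 56.40),   # near, but not marine
    Occurrence("Rhincodon typus", 24.00, 54.00),     # marine, but far away
]
MARINE_SPECIES = {"Acanthaster planci", "Rhincodon typus"}
FLAGGED_SPECIES = {"Rhincodon typus": "redlisted"}
SAMPLE_SITE = (25.0, 56.4)  # hypothetical divesite coordinates

for row in run_pipeline(SAMPLE_OCCURRENCES, MARINE_SPECIES,
                        FLAGGED_SPECIES, SAMPLE_SITE, radius_km=5.0):
    print(row)
```

In this sketch the marine check runs before the distance check only because a set lookup is cheaper than the trigonometry. The real pipeline will have to decide where each filter belongs once the actual data volumes are known.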
The idea is complete. Let's start building.

