I'm one of the creators of the Beaker browser[1], and the reason we use Dat is that, as a p2p protocol, it offers a lot of neat properties, including making datasets more resilient. As long as at least one peer on the network is hosting a dataset, it remains reachable, even if the original author has stopped hosting it.
I won't speak authoritatively on behalf of the Dat team, but I believe one of their goals is to make it difficult for public scientific datasets to be lost, and data living on a centralized server is particularly vulnerable to that.
Because Dat is just a protocol, decentralization is a choice. For quick, ephemeral exchanges, direct p2p works brilliantly. For longer-lived datasets, sharing them with a (commercial) mirror might make sense. Or perhaps you host them yourself. The beauty is that you, as a user of the protocol, get to decide what works best for you.
We have a few approaches to the disappearing-data problem.
First, we are working with libraries, universities, and other groups with large amounts of storage and bandwidth. They would help host datasets used within their own institutions, as well as other essential datasets.
Second, we have started to work on at-home data hosting with Project Svalbard[1]. It's kind of a SETI@home idea, where people can donate storage space at home to help back up "unhealthy" data (data that doesn't have many peers).
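To make the idea concrete, here's a minimal sketch of how a volunteer node might decide which datasets to pin. Everything here is invented for illustration (the dataset records, the peer threshold, the storage budget); it's not real Project Svalbard code.

```python
# Hypothetical sketch: a volunteer node picks "unhealthy" datasets
# (few peers) to back up, fitting them into a local storage budget.

def pick_unhealthy(datasets, min_peers=3, budget_bytes=10 * 1024**3):
    """Return dataset keys to pin: fewest peers first, within the budget."""
    # Only datasets below the peer threshold count as "unhealthy".
    unhealthy = [d for d in datasets if d["peers"] < min_peers]
    # Back up the most at-risk datasets (fewest peers) first.
    unhealthy.sort(key=lambda d: d["peers"])
    picked, used = [], 0
    for d in unhealthy:
        if used + d["size"] <= budget_bytes:
            picked.append(d["key"])
            used += d["size"]
    return picked

# Invented example data: keys and sizes are placeholders.
datasets = [
    {"key": "dat://aaa", "peers": 1, "size": 2 * 1024**3},
    {"key": "dat://bbb", "peers": 12, "size": 1 * 1024**3},
    {"key": "dat://ccc", "peers": 0, "size": 5 * 1024**3},
]
print(pick_unhealthy(datasets))  # ['dat://ccc', 'dat://aaa']
```

The well-seeded dataset (12 peers) is left alone; the node spends its donated space only where the network is thin.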
Finally, for "published" data (such as data on Zenodo or Dataverse), we can use those sites as a permanent HTTP peer. So if the data isn't available from any p2p peers, you can still get it directly from the published source.
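The fallback logic above can be sketched in a few lines. The transport functions here are stand-ins, not real Dat APIs: the point is only the ordering (try peers first, then the published HTTP source).

```python
# Hypothetical sketch of p2p-first fetching with an HTTP-mirror fallback.
# fetch_from_peer and http_mirror are injected stand-in transports.

def fetch_with_fallback(key, peers, http_mirror, fetch_from_peer):
    """Try each p2p peer; if none can serve the data, use the HTTP mirror."""
    for peer in peers:
        data = fetch_from_peer(peer, key)
        if data is not None:
            return data, "p2p"
    # No peer could serve the dataset: fall back to the published source.
    return http_mirror(key), "http"

# Demo with stub transports: no peer has the data, so we fall back to HTTP.
data, source = fetch_with_fallback(
    "dat://example",
    peers=["peer-a", "peer-b"],
    http_mirror=lambda key: b"published bytes",
    fetch_from_peer=lambda peer, key: None,  # simulate unreachable peers
)
print(source)  # http
```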
As others have said, decentralization is an approach, not a solution. It gives you the flexibility to centralize or distribute data as necessary without being tied to a specific service. But we still need to solve the problem!
That's something we think about a lot. Decentralization isn't a silver bullet for data loss, but I do think it's more resilient than what we typically do now.
To counter that, you can take measures to mirror important datasets with a dedicated peer. It requires effort, but it at least makes it much, much harder for, say, a government agency to take down public data without warning.
This may not always be the case, but so far blockchains have low throughput and large datasets that you have to sync in full. Compared to other databases, they don't perform well, so if you don't need decentralized strict consensus, a blockchain isn't a good choice.
1. https://github.com/beakerbrowser/beaker