I have been in and around podcasting for the past 10 years. In the modern age of YouTube and SoundCloud I am still surprised that hosting for podcasting is a thing that people actually need to pay for.
Even though the file sizes for audio podcasts are literally a fraction (~1/10) of those of a YouTube video, to host a podcast one must pay a fee of $10-15 a month to a hosting provider like Buzzsprout or Libsyn. They do provide a decent service, because podcasting is a very peaky business: usually you’ll get a lot of downloads very tightly around the drop time of your podcast episodes, and dribs and drabs in between. Basic website hosting services like Dreamhost, which you can absolutely host mp3s on, would clog up under the type of load you see for a typical podcast.
I have seen problems with the long term viability of podcast hosting: if you stop paying, your podcast goes down. I know of a lot of podcasts that were a labour of love at the time, but the producers decided they had had enough or had said what they wanted to say. They stopped their podcast, they stopped paying their hosting provider, and now their podcast cannot be found any more.
With the current hosting providers there is no legacy option. The long term success of the podcast might not amount to anything, but the historical value of the content is essentially lost forever, and that is it! The Internet Archive and their Wayback Machine have saved a lot of the recent history of websites, but they have not been backing up the audio files (mp3s etc.) that make up a podcast.
As an engineer, I am still amazed that BitTorrent has been around for 20 years (released in 2001, I am not kidding!) but has never been married up with podcasting. The type of success one likes to see with a podcast, the type that demands a hosting provider, almost spoons into the BitTorrent protocol, where popular means more and faster, and where, as long as there is at least one seed, the content will always be available. BitTorrent has been demonized due to its connection with copyright infringement; however, the technology is agnostic of the copyright of the content, and I have seen it used quite legitimately in the distribution of Linux builds like Ubuntu.
As to podcasts, in the main the producer is the owner of the content and as such should have the ability to place it wherever they deem fit. (I will not get into the case where the content they produce itself contains infringing content. That is a separate issue that I’ll come back to another time, but it certainly muddies the water.)
In the past few months I have been looking at podcasting with an engineer’s eye and I am still amazed at how much has not changed in the last 15 years. I don’t know where I first heard of the Podcast Index guys, but what they were talking about, and their work to improve the underlying elements of podcasting, really tied up with where and what I was thinking. In one of their podcasts I heard mention of a new distributed protocol called IPFS.
The InterPlanetary File System (IPFS) is a protocol and peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespace connecting all computing devices.
I was at first intrigued, then investigated it, and then decided to see if I could take my existing website and host the whole thing under IPFS.
And I did, here it is:
It is almost too good. It looks and acts exactly like a website, but everything from the webpages to the underlying mp3s and the RSS feeds is being served over IPFS.
Even if you are not interested in the podcasting portion, this is nearly all that is required to get a basic website up and running under IPFS.
So how do you do it? You need:
- A pinning service. I used Pinata, which allows you to host up to 1GB free and pay beyond that (seems reasonable so far)
- A static site version of your website
- A domain name (or even a subdomain will do) with the ability to change some records. I was using Cloudflare before this, but found out they also provide a full read-only IPFS gateway.
Pinning Service 1 – Content
Upload all the media content you want to use to IPFS. I am using Pinata, but it is similar for other services.
(If you have multiple files, I still recommend uploading them individually, not by folder. It does not seem possible to append new files to a folder later without all your links changing.)
Once uploaded, you will get a result with a CID (Content Identifier), which is what is used to identify your file on the network.
This is the http resolved address to the file denoted by that CID:
or the cloudflare read-only version:
(The “?filename=goc24.mp3” is not strictly required for the link to work, but some podcasting tools parse the end of the URL to detect the file type, which is why it is present here.)
Once all your media files are uploaded, record all the respective links. You need to list these explicitly inside your static site.
There are a number of ways of creating a static site. Jekyll is used a lot on GitHub, there is a Python equivalent called Pelican (that’ll be another post), or you can just write straight HTML, good luck.
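As a rough sketch of how these links are put together, the helper below builds a gateway URL from a CID and an optional file name. The function name, the placeholder CID "QmExampleCid" and both gateway hosts are assumptions made for illustration; check your own pinning service for the host it actually gives you.

```python
from urllib.parse import quote

# Assumed gateway hosts, used for illustration only.
CLOUDFLARE_GATEWAY = "https://cloudflare-ipfs.com"
PINATA_GATEWAY = "https://gateway.pinata.cloud"


def gateway_url(cid, filename=None, gateway=CLOUDFLARE_GATEWAY):
    """Return the HTTP gateway address for a CID.

    If a filename is given it is appended as "?filename=...", so that
    tools which look at the end of the URL can still see the ".mp3".
    """
    url = f"{gateway.rstrip('/')}/ipfs/{cid}"
    if filename:
        url += "?filename=" + quote(filename)
    return url


if __name__ == "__main__":
    # "QmExampleCid" is a placeholder, not a real CID.
    print(gateway_url("QmExampleCid", "goc24.mp3"))
    print(gateway_url("QmExampleCid", "goc24.mp3", gateway=PINATA_GATEWAY))
```

Keeping the gateway as a parameter makes it easy to produce both forms of the link for the same CID.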
In my case I already had a full website up and running under WordPress:
Using the WordPress plugin Simply Static, I was able to do a full static dump of my website, which can be seen here:
This is not a bad speed up in general. Most podcast websites are fairly basic, with not much user interaction. Things like commenting moved on to Facebook and the like many years ago, so having a dynamic, MySQL driven website isn’t as needed as it used to be.
However, this is just the same website with all the mp3 links going back to my hosting provider, which in this case is Buzzsprout. Now it is time for a bit of engineering.
I wrote a script that goes into every static file created, searches for all the mp3 links (mp3 is the only media file type I am using) and replaces each one with its IPFS link equivalent. I used the Cloudflare link, but one could use the Pinata link if required. However, if I were to move on from Pinata to another pinning service, the Cloudflare link would still work and the Pinata one might not. As a result I use the Cloudflare links to IPFS content for the remainder of my website work.
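This is not the script I used, just a minimal sketch of the same idea in Python. It assumes you have built, by hand, a mapping from each original mp3 URL to its IPFS gateway link; the function names, the file suffixes and the example URLs are all hypothetical.

```python
import re
from pathlib import Path

# Matches an http(s) URL up to and including the first ".mp3".
MP3_URL = re.compile(r"""https?://[^\s"'<>]+?\.mp3""")


def rewrite_mp3_links(text, mapping):
    """Swap every mp3 URL found in `text` for its entry in `mapping`.

    URLs that are not in the mapping are left untouched, so running
    this twice over the same text does no harm.
    """
    def swap(match):
        return mapping.get(match.group(0), match.group(0))

    return MP3_URL.sub(swap, text)


def rewrite_site(root, mapping, suffixes=(".html", ".xml")):
    """Rewrite the mp3 links in every matching file under `root`.

    Returns the list of files that were actually changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            old = path.read_text(encoding="utf-8")
            new = rewrite_mp3_links(old, mapping)
            if new != old:
                path.write_text(new, encoding="utf-8")
                changed.append(path)
    return changed


if __name__ == "__main__":
    # Hypothetical mapping: original hosted URL -> IPFS gateway URL.
    mapping = {
        "https://example.com/audio/goc24.mp3":
            "https://cloudflare-ipfs.com/ipfs/QmExampleCid?filename=goc24.mp3",
    }
    for path in rewrite_site("static_site", mapping):
        print("updated", path)
```

Work on a copy of the static dump, since the files are rewritten in place.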
This resulted in this website:
This is still a conventionally hosted website on Dreamhost, just with all the mp3s now served from IPFS.
I also had to remap the RSS links to an XML file. A static site (from what I can tell) cannot serve an XML file from a straight path.
On my website I was able to take the full content of this site, as a folder, and upload this folder to Pinata:
Pinning Service 2 – Website
Very similar to the previous pinning, but upload the full folder containing the website. Once done, it will look like this:
You can already check that the website is hosted on IPFS correctly, taking the CID Qme8cWySPh8KXfXesZi4Ab2sQ6te388kZAEdEzTApwzWXk:
Almost there. However, every time you want to change your website, you have to upload a new folder, which results in a brand new different CID.
Domain name mapping
To show the IPFS content as a hosted domain, one needs to be able to add a CNAME and a TXT record against the subdomain (or domain) you want to use.
In this case I set the CNAME record, ipfs, to point to www.cloudflare-ipfs.com.
The TXT record then needs to contain the IPFS path that points to the head of your current site.
Name: "_dnslink.ipfs" Content: "dnslink=/ipfs/Qme8cWySPh8KXfXesZi4Ab2sQ6te388kZAEdEzTApwzWXk"
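To make the shape of the two records concrete, here is a small sketch that builds them as plain dictionaries. The function name and the dictionary layout are my own assumptions and do not match any particular DNS provider's API; the values are the ones used in this post.

```python
def dnslink_records(subdomain, cid, gateway_host="www.cloudflare-ipfs.com"):
    """Return the CNAME and TXT records needed to map a subdomain to a CID."""
    return [
        # Sends web traffic for the subdomain to the IPFS gateway.
        {"type": "CNAME", "name": subdomain, "content": gateway_host},
        # Tells the gateway which IPFS content to serve for that subdomain.
        {
            "type": "TXT",
            "name": f"_dnslink.{subdomain}",
            "content": f"dnslink=/ipfs/{cid}",
        },
    ]


if __name__ == "__main__":
    site_cid = "Qme8cWySPh8KXfXesZi4Ab2sQ6te388kZAEdEzTApwzWXk"
    for record in dnslink_records("ipfs", site_cid):
        print(record["type"], record["name"], record["content"])
```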
Once this has been set and has propagated through the internet (give it 30 minutes), go to the new domain and you should see the full website hosted.
If you want to subsequently change your website, you need to go back to the static file generation step, reupload the new site to your pinning service, and get the new CID. The TXT record content on your domain name mapping service then needs to be changed to this new CID for the new site to be picked up. This, I feel, is one of the more annoying aspects of the flow I have described so far.
One issue I have seen with IPFS so far is the random access case. If you have a file (mp3) that has not been accessed in a while, and we are talking weeks here, it doesn’t appear immediately. I have seen cache misses of content take up to 60 seconds to be serviced. Now in the case of downloaded content that isn’t that big a deal; however, with streaming becoming more and more popular, a 60 second hit might be enough for a punter to move on to the next thing.
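Since forgetting this step is easy, a small check can help. The sketch below pulls the CID out of a TXT record value and compares it with the CID of the newly uploaded folder. The function names and the second CID are made up for illustration, and fetching the live TXT record is left to whatever DNS tool you prefer.

```python
DNSLINK_PREFIX = "dnslink=/ipfs/"


def cid_from_dnslink(txt_value):
    """Extract the CID from a TXT value like "dnslink=/ipfs/<CID>"."""
    value = txt_value.strip().strip('"')
    if not value.startswith(DNSLINK_PREFIX):
        raise ValueError(f"not an /ipfs/ dnslink value: {txt_value!r}")
    # Drop any sub-path that follows the CID.
    return value[len(DNSLINK_PREFIX):].split("/", 1)[0]


def needs_update(txt_value, new_cid):
    """Return True if the TXT record still points at an older CID."""
    return cid_from_dnslink(txt_value) != new_cid


if __name__ == "__main__":
    current = "dnslink=/ipfs/Qme8cWySPh8KXfXesZi4Ab2sQ6te388kZAEdEzTApwzWXk"
    # "QmNewSiteCid" is a placeholder for the CID of the new upload.
    print(needs_update(current, "QmNewSiteCid"))
```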
However, as long as it is pinned somewhere, it will always be found.
Also, this is just IPFS hosting. When people hit the website and try to get files, particularly the mp3s, the files come to them as HTTP downloads. That is one way only, so the download is not available to be fed back into the IPFS network. The IPFS to HTTP translation happens somewhere between Pinata and Cloudflare in this model. That is OK for now, as there is probably not a single podcast app or web player that can take a full ipfs:// path name, use it, and then make the content available back into the network. As of this writing, Brave was the only browser that could even use IPFS path names. If this protocol takes off in podcasting, I hope to adjust the form of the website to support it. Small moves Spark, small moves!
That’s it, that is how I got up and running the first time. Any comments or improvements are gratefully received.
You can find a list of other IPFS gateways listed here