this post was submitted on 23 Jun 2023
If the files are not going to change much, the typical approach is to use a CDN service (e.g. Cloudflare, Akamai, Fastly). The idea is that you have an "origin", which can be any old server that serves your files over HTTP (even a VPS running nginx). The CDN is configured to proxy requests to the origin, building up a cache of the files it serves. The CDN can then serve those files from cache on its own (very large) infrastructure. See also What is a CDN?
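To make the origin/cache relationship concrete, here is a toy sketch of the pull-through idea in Python. It is not how any particular CDN is implemented: the `PullThroughCache` class, the `ORIGIN_FILES` dict (standing in for the nginx origin), and the file path are all made up for illustration, and real CDNs add expiry, eviction, and many edge locations on top of this.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy model of a CDN edge: serve from cache, go to the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_hits = 0  # how many requests actually reached the origin

    def get(self, path: str) -> bytes:
        if path not in self._cache:
            # Cache miss: proxy the request to the origin and remember the result.
            self.origin_hits += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]


# Hypothetical origin: a dict standing in for a small server with static files.
ORIGIN_FILES = {"/models/weights.bin": b"\x00" * 1024}


def fetch_from_origin(path: str) -> bytes:
    return ORIGIN_FILES[path]


edge = PullThroughCache(fetch_from_origin)
for _ in range(3):
    edge.get("/models/weights.bin")

# Three client requests, but the origin only had to serve the file once.
print(edge.origin_hits)
```

The point is that the origin's bandwidth scales with the number of distinct files, not the number of downloads, which is why a small origin server is usually enough.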
So I got curious and wondered how HuggingFace hosts their files. It’s AWS CloudFront:
That's true, but I just checked a few CDNs and you won't find one for less than $0.01/GB. The lowest I found was $0.03/GB.
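For a rough sense of what a per-GB price means in practice, here is a back-of-the-envelope sketch. Only the $0.03/GB rate comes from the comment above. The 5 GB file size and 1,000 downloads per month are made-up numbers, and it ignores cache misses, request fees, and volume discounts.

```python
def monthly_egress_cost(file_size_gb: float, downloads_per_month: int, price_per_gb: float) -> float:
    """Estimate monthly CDN egress cost in dollars for one file."""
    return file_size_gb * downloads_per_month * price_per_gb


# Assumed workload (hypothetical): a 5 GB model downloaded 1,000 times a month.
cost = monthly_egress_cost(file_size_gb=5, downloads_per_month=1000, price_per_gb=0.03)
print(f"${cost:.2f} per month")
```

Plug in your own file sizes and download counts. The cost grows linearly with both, which is why unmetered bandwidth starts to look attractive for large files.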
To keep costs down and depending on how much you want to get your hands dirty, you could start investigating renting dedicated servers. Some hosting providers offer unmetered network connectivity. Here's something from OVH: https://www.ovhcloud.com/en/bare-metal/rise/rise-stor-1/
And hey, depending on how grassroots the project is, there's always BitTorrent! ;)
I was considering this. The hosting provider we use for model training runs doesn't charge for ingress/egress. Their storage costs would eat us alive though haha. OVH looks much more promising.